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1  SUMMARY 


This  report  briefly  describes  the  research  carried  out  by  faculty  and  grad¬ 
uate  students  of  the  Department  of  Electrical  Engineering  at  the  University 
Minnesota  under  grant  AFOSR  AF/F49620-92-J-0134.  The  principal  investiga¬ 
tor  for  this  research  was  Prof.  Ahmed  H.  Tewfik.  The  period  covered  in  this 
report  is  February  1,  1992  to  January  31,  1993. 

The  basic  goal  of  this  grant  was  to  study  the  role  that  wavelet  theory  crm 
play  in  information  representation  and  extraction.  We  focused  our  attention 
primarily  on  surveillance  applications.  As  part  of  our  research,  we  studied  two 
problems  that  arise  in  surveillance.  The  first  problem  was  that  of  determining 
the  directions  of  arrivals  of  a  set  of  plane  waves  in  the  presence  of  a  background 
noise  of  unknown  correlarion  structure.  The  second  problem  involved  selecting 
an  optimal  set  of  N  waveforms,  with  A'  fixed,  to  obtain  the  best  reconstruction 
of  a  distributed  range-Doppler  target  reflectivity  function 

We  established  that  the  correlation  structures  of  the  wavelet  transforms  of 
the  outputs  of  an  array  of  sensors  due  to  plane  waves  and  these  due  to  wide 
classes  of  background  noise  are  different.  We  constructed  a  method  that  exploits 
these  differences  to  estimate  the  directions  of  arrival  of  the  plane  waves.  We  also 
demonstrated  the  feasibility  of  this  approach  and  its  superiority  over  traditional 
approaches  using  numerical  simulations. 

We  showed  that  the  most  accurate  reconstruction  of  a  range-Doppler  target 
density  that  can  be  computed  from  N  waveforms  and  their  echoes  is  obtained 
by  transmitting  the  singular  functions  corresponding  to  the  N  largest  singular 
values  of  two  kernels  derived  from  the  target  density.  The  singular  functions  are 
valid  wavelets  that  obey  an  additional  orthogonality  constraint  in  the  frequency 
domain.  Using  this  result,  we  proposed  a  solution  to  the  problem  of  choosing 
a  set  of  A^  waveforms  to  reconstruct  with  high  accuracy  an  arbitrary  unknown 
target  range-Doppler  density  function. 

We  also  studied  two  signal  representation  problems.  Specifically,  we  investi¬ 
gated  the  completeness  of  an  arbitrary  sampling  of  a  redundant  dyadic  wavelet 
transform.  We  gave  a  necessary  and  sufficient  condition  for  the  completeness  of 
any  such  representation  of  any  discrete  time  or  space  finite  data  length  signal  (in¬ 
cluding  dyadic  wavelet  transform  extrema  and  zero-crossings  representations). 
We  showed  that  completeness  depends  only  on  the  locations  of  the  retained  sam¬ 
ples  of  the  dyadic  wavelet  transform.  Our  completeness  test  is  more  convenient 
and  easier  to  verify  than  previously  derived  tests.  Furthermore,  we  explained 
why  some  conclusions  reported  in  the  literature  hold  for  most  signals  except  for 
some  extreme  cases.  W’e  showed  how  to  ensure  the  completeness  of  the  repre¬ 
sentation  by  adding  additional  information  in  those  cases  where  the  sampled 
dyadic  wavelet  transform  domain  representation  is  incomplete.  We  also  studied 
the  numerical  stab.li  y  of  such  a  representation.  The  stability  issue  is  important 
in  the  sense  that  a  numerically  unstable  representation  is  useless  from  a  practical 
point  of  view.  We  described  a  fast  fast  Fourier  transform  (FFT)  based  recon- 
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struction  algorithm  from  such  a  signal  representation.  This  work  is  important 
in  that  it  provides  us  with  the  theoretical  tools  that  we  need  to  further  study 
efficient  and  highly  adaptive  signal  representations.  Such  representations  can 
lead  to  fast  processing  algorithms  and  more  powerful  signal  coding  algorithms. 

Finally,  we  studied  another  wavelet  based  signal  representation  procedure 
Our  aim  was  to  understand  how  discrete  orthogonal  non-redundant  wavelet 
representations  (as  opposed  to  the  redundant  dyadic  representations  mentioned 
above)  can  be  adapted  to  a  given  problem  by  minimizing  a  cost  function  (e  g  , 
bit  rate  in  coding  or  number  oi  flops  required  to  implement  a  detector  in  radar 
detection)  subject  to  satisfying  a  second  constraint  (e  g.,  the  quality  of  the  en¬ 
coded  waveform  or  a  fixed  probability  of  detection  and  a  maximal  probability  of 
fdse  alarm).  We  chose  to  study  this  problem  in  the  context  of  high  quality  low 
bit  rate  audio  coding.  Our  results  extend  however  to  other  similar  problems, 
e.g.,  the  design  of  near-optimal  radar  waveform  detectors  of  minimal  complex¬ 
ity.  We  showed  that  the  regularity  property  of  wavelets  is  important  in  coding 
applications.  We  also  established  that  adapting  the  selection  of  the  analysis 
wavelet  to  the  underlying  signal  can  lead  to  great  reductions  in  bit  rate.  Our 
main  contribution  was  to  construct  a  proceduie  that  exploits  the  masking  effect 
in  human  hearing  by  properly  choosing  the  analysis  wavelet  in  a  dynamic  and 
adaptive  manner  Our  procedure  has  lead  to  a  high  quality  audio  compression 
procedure  that,  for  a  given  quality,  can  achieve  lower  bit  rates  than  the  MPEG 
audio  standard. 

In  the  following  section,  we  present  a  brief  introduction  to  wavelet  theory. 
Next,  we  summarize  the  results  that  we  have  obtained.  A  list  of  publications 
supported  in  part  by  this  grant  is  also  included. 

We  are  continuing  our  research  in  the  areas  discussed  in  this  report  with 
funding  from  AFOSR  under  grant  AF/F49620-93-1-0151DEF. 
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2  WAVELET  TRANSFORMS 

2. A  Introduction 

Signal  transforms  have  become  a  powerful  tool  in  system  theory  and  sig¬ 
nal  processing.  In  a  typical  transform,  a  signal  of  interested  is  expressed  as  a 
weighted  superposition  of  a  countably  infinite  set  of  basis  functions  (“discrete 
transforms”)  or  as  a  weighted  integral  of  a  particular  function  (“continuous 
transforms”)  Examples  of  discrete  transforms  of  include  Fourier  series  expan¬ 
sions  of  periodic  functions  and  Karhunen-Loeve  representations  of  stochastic 
processes  over  a  finite  interval.  The  best  known  continuous  transform  is  of 
course  the  continuous  time  Fourier  transform. 

While  known  transforms  are  extremely  useful  in  various  applications  (e  g  . 
Fourier  transforms  are  the  basis  of  system  and  modulation  theories,  Karhunen- 
Loeve  expansions  are  used  in  pattern  recognition  and  detection  and  estimation 
theories)  they  do  suffer  from  a  number  of  disadvantages  when  applied  to  cer¬ 
tain  problems  For  example,  the  computation  of  a  Karhunen-Loeve  expansion  is 
an  expensive  operation  as  it  involves  solving  an  eigenvalue-eigenfunction  (vec¬ 
tor)  problem.  Furthermore,  most  transforms  yield  information  about  the  sigual 
which  is  not  localized  in  time,  e.g.  the  Fourier  transform  or  coefficient  of  a  sig¬ 
nal  does  depend  on  the  value  of  the  signal  over  its  entire  support.  This  implies 
that  coefficients  may  be  wasted  to  represent  the  signal  over  intervals  where  it  is 
identically  zero,  transforms  change  if  the  support  of  the  signal  changes  or  more 
data  is  acquired  and  it  is  difficult  to  relate  the  local  behavior  of  a  signal  to  its 
transform 

To  address  the  above  mentioned  drawbacks,  researchers  have  recently  pro¬ 
posed  the  wavelet  transform  as  a  fast  technique  for  studying  the  local  behavior 
of  a  signal'.  The  idea  behind  this  new  transform  is  that  if  one  wants  to  study 
the  local  behavior  of  a  signal,  one  has  to  implicitly  window  the  signal  and  focus 
on  the  resulting  signal  slice.  Short  windows  will  of  course  lead  to  high  resolution 
in  the  time  domain  and  lower  resolution  in  the  frequency  domain.  On  the  other 
hand,  long  windows  provide  high  resolution  in  the  frequency  domain  and  low 
resolution  in  the  time  domain.  To  gain  a  certain  degree  of  freedom  in  trading 
time  versus  frequency  resolution  one  may  then  want  to  use  windows  of  different 
support  lengths.  However,  such  windows  should  be  properly  constructed  to  en¬ 
able  the  user  to  relate  the  results  obtained  with  the  various  analysis  windows. 
The  solution  adopted  in  wavelet  transforms  is  to  use  dilates  of  a  single  properly 
constructed  window. 

This  section  is  organized  as  follows.  We  begin  with  a  review  of  continuous 
wavelet  transforms.  The  focus  in  that  sub-section  is  on  the  ability  of  the  wavelet 
transform  to  trade  time  and  frequency  resolutions  in  a  specified  manner.  This 
property  of  the  wavelet  transform  is  emphasized  in  applications  that  require 

•Although  the  formalization  of  the  wavelet  transform  and  the  study  of  its  properties  is 
new,  the  idea  behind  it  is  not  new.  See,  e.g.  [9],  [10], 
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time-frequency  representations  of  an  underlying  signal.  We  then  shift  our  at¬ 
tention  to  discrete  orthogonal  wavelet  transforms.  This  class  of  transforms  is 
non-redundant  and  has  been  found  to  be  most  useful  in  signal  processing  ap¬ 
plications.  In  particular,  we  will  discuss  the  construction  of  discrete  orthogonal 
wavelets  and  the  computation  of  the  discrete  orthogonal  wavelet  transform  of  a 
given  signal.  We  will  emphasize  in  that  part  of  Section  2  the  issue  of  regularity 
of  the  analyzing  wavelet  and  the  effect  of  that  regularity  on  the  structure  of  the 
wavelet  transform  of  broad  classes  of  signals. 


2.B  Continuous  Wavelet  Transform 


In  a  continuous  wavelet  transform,  one  attempts  to  express  the  signal  /(<)  in 
terms  of  translates  of  dilates  of  a  single  function  where  the  weight  given  to 
each  translate  and  dilate  is  proportional  to  the  inner  product  between  f{t)  and 
that  particular  translate  or  dilate.  Assuming  for  the  moment  that  f{t)  has  finite 
energy  and  that  V’(t)  is  a  suitable  wavelet  function  then  /(<)  can  be  written  as 
[25],  [37] 

/(O  =  7^  /  /  VsF(s,u)il;(s{(  -  ■u))duds  {2.B.1) 

Joo  Jo 

where  C^.  is  a  finite  constant  and  F{s,u)  is  the  wavelet  transform  of  f[t)  and 
is  given  by 


^s,u)  =  y/s  f 


/(l)v(s(<  -  u))di. 


(2.B.2) 


The  variable  s  in  the  above  equations  is  the  "scale"  variable  because  it  controls 
the  effective  width  of  the  support  of  The  variable  “u”  has  the  dimension 
of  time  and  gives  the  amount  by  which  4’{si)  has  been  translated  in  the  time 
domain. 

Since  (2.B.1)  must  hold  for  any  finite  energy  signal  /(/),  i’(<)  cannot  be  an 
arbitrary  function.  By  taking  the  Fourier  transform  of  both  sides  of  (2.B.1)  it 
becomes  clear  that  we  must  have 


Wf)]' 


ds  >  0 


(2.B.3) 


where  ^'(w)  denotes  the  Fourier  transform  of  t/'Cf)  (Otherwise  we  may  not  be 
able  to  represent  functions  that  have  energies  at  frequencies  where  the  integral 
in  the  above  equation  is  zero.)  In  fact,  the  wavelet  is  chosen  such  that 


0<  C, 


‘  =  / 


I'Kf)!' 


ds  <  oo. 


(2.B.4) 


Note  that  by  making  a  change  of  variable  of  integration  in  (3.4)  we  may  also 
express  as 

Q.  =  /  (2.B.5) 
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Eqs.  (3.4)  and  (2.B.5)  implies  that  4{u))  is  zero  at  w  =  0  and  must  decay 
fast  enough  as  u  tends  to  0.  This  is  condition  is  intuitively  pleasing;  the  only 
frequency  that  is  not  affected  by  division  by  the  scale  s  is  w  =  0.  If  '^(0)  ^  0 
then  all  dilates  of  would  contribute  to  DC  (w  =  0)  leading  to  an  infinite 
concentration  of  energy  at  w  =  0. 

By  using  (2.B.5)  one  can  also  show  [25]  that  the  wavelet  transform  is  energy 
preserving,  i.e.  one  can  establish  the  following  “Parseval  Theorem”  like  result 

r  r  iF(s,u)i*dsdu=cv,  r  \m\^dt.  {2.B.6) 

Jo  J— 00  J  — 00 


Equation  (2.B.5)  provides  a  “recipe”  for  constructing  wavelet  functions  xl’(t). 
Specifically,  to  construct  a  wavelet  t/>(t)  one  can  start  with  any  finite  energy 
function  ^(f)  such  that  its  Fourier  transform  0(u))  is  bounded  and  decays  to 
zero  faster  than  a’P'*'*  for  some  integer  p  as  u  tends  to  infinity.  One  can  then 
take  4'(w)  =  (jij)P4>(ui).  This  is  illustrated  in  Fig.  1  where  we  have  chosen 
4>(i)  to  be  a  zero-mean  Gaussian  function  of  variance  equal  to  unity  and  the 
corresponding  is  taken  to  be  the  second  derivative  of  the  Gaussian,  a  choice 
that  is  popular  in  the  computer  vision  literature. 

Note  that  wavelets  constructed  using  the  above  recipe  will  have  p  zeros  at 
w  =  0  and  hence  will  have  p  vanishing  moments,  i.e., 

f  =  0,  m  =  0,1,2,-  -.p- 1.  (2.B.7) 

J  — OC 

This  in  turn  implies  that  if  the  signal  /(<)  is  smooth  enough  then 

lF(s,u)|<C—  (2.B.8) 


for  some  finite  constant  C  that  depends  on  xl'{t)  and  f{t).  Thus,  for  such  smooth 
signals,  most  of  the  energy  in  F(s,u)  will  appear  at  lower  scales. 

Now  let  us  assume  wunout  loss  of  generality  that  the  support  of  t^(t)  is 
centered  around  the  origin  Denote  by  erf  the  variance  of  c(<)  in  the  time 
domain,  i.e. 


r 


STo.r-(t}dt 


(2.B,9) 


Furthermore,  assuming  that  xp{t)  is  real  we  have  l’J'(u^)i  =  I’l'l— u')|  Denote  by 
w  and  the  center  of  the  pass-  band  of  the  Fourier  transform  of  V’(0 
the  frequency  domain  and  its  variance  around  w  ,  i.e. 


w|»(w)pdw 

/o'"  |^r(w)(2d,^ 

/J" 


(2.B.10) 

(2.B.11) 


/ 


Clearly,  F{s,u)  is  mainly  affected  by  the  behavior  of  /(f)  in  the  time  interval 
(ii  -  20x1  s,  u  -I-  2<T|/s].  On  the  other  hand,  by  Parseval's  theorem,  f(s,  u)  may 
also  be  rewritten  as 

F(s,u)  =  -^J (2  B.12) 

where  T{w)  denotes  the  Fourier  transform  of  /(f).  Hence,  F(s,  u)  is  also  mainly 
affected  by  the  behavior  of  F{u!)  in  the  time  interval  {s(Ii7  —  2au/).s('Z  +  2a^)]. 
In  particular,  for  large  values  of  s,  f  (s,ti)  carries  information  about  /(f)  that  is 
essentially  localized  in  the  time  domain  whereas  for  small  values  of  s,  it  carries 
information  about  F{u)  that  is  localized  in  the  frequency  domain. 

Observe  also  that  the  wavelet  transform  (2.B.1)  is  effectively  a  mapping  from 
1-  D  functions  to  2-D  functions.  Since  there  is  no  one-to-one  correspondence 
between  all  1-  D  finite  energy  and  all  2-D  finite  energy  functions,  we  should 
expect  F{s,u)  to  be  redundant,  i.e.  to  actually  be  in  a  subspace  of  all  finite 
energy  2-D  functions.  This  turns  out  to  be  the  case.  In  particular  it  may  be 
shown  that  any  valid  wavelet  transform  F{s,u)  must  be  left  invariant  by  the 
application  of  a  particular  operator  with  kernel  A'(s,  s',  u,  u'),  eg. 


-'-rr 


F(s.  u)K{s.  s':  u.  ii')duds 


(2  B  13) 


where  the  non-invertible  operator  A'(s,s';  u,  u')  is  given  by 


1  yx  ___ 

A'(s, s', u.  u')  =  —  /  dt\/7Pi(s'(i  -  -  u)).  {2.B.14) 

^  v  Joe 

While  in  general  signal  processing  applications  this  redundancy  may  not  be 
desirable,  it  may  be  useful  in  certain  applications. 

In  the  next  subsection  we  shall  derive  a  non-redundaut  wavelet  transform  by- 
sampling  F(s.y)  on  an  appropriate  grid  for  a  more  restricted  class  of  wavelets 
The  advantages  of  using  the  redundant  transform  developed  here  over 
the  non-redundant  one  are  that  one  can  use  wavelets  from  a  much  wider  class 
and  ti'.at  one  can  tolerate  larger  quantization  or  round-off  errors  in  representing 
F(s.  V  !  while  guaranteeing  that  the  reconstructed  signal  /(()  has  a  relative  error 
of  norm  smaller  than  that  in  the  quantized  version  of  f  (s,  u).  A  complete  theory 
of  redundant  representation  similar  to  the  continuous  transform  given  here  is 
described  in  [18]. 

Fig.  2.b  illustrates  the  wavelet  transform  of  the  function  f(i)  given  in  Fig. 
2. a  computed  with  respect  to  a  wavelet  with  five  vanishing  moments  Note  that 
the  local  extrema  of  the  wavelet  transform  at  the  various  scales  correspond  to 
points  of  discontinuity  of  the  function  f(i)  or  of  its  derivative.  This  can  be  easily- 
explained  as  follows.  Let  us  assume  for  simplicity  that  the  support  of  v{t)  is 
finite  and  that  rl{t)  has  p  vanishing  moments.  Let  us  also  assume  that  away 
from  its  points  of  discontinuity  the  signal  /(<)  is  a  piece-wise  smooth  function. 
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In  particular,  we  assume  that  we  can  expand  f{t)  in  the  neighborhood  of  t  —  uq 
between  two  points  of  discontinuities  of  /(t)  in  ap-  term  Taylor  series  expansion 
Now  note  that  for  large  values  of  the  scale  parameter  “s”  the  support  of 
will  be  small  and  F{s,u)  will  be  determined  by  the  values  of  f{i)  around  the 
point  uq.  In  particular,  if  we  substitute  the  p-term  Taylor  series  expansion  of 
/(<)  in  the  integral  that  defines  F{s,  u)  and  use  the  fact  that  Vff)  has  p  vanishing 
moments  we  find  that  |F(s,u)|  will  decay  at  the  rate  of  l/s**  in  the  vicinity  of 
u  =  tio  On  the  other  hand,  if  the  neighborhood  of  size  equal  to  the  support  of 
tt'(st)  around  u  =  uq  contains  points  where  /(<)  is  not  smooth  enough  then  this 
rate  of  decay  will  not  hold. 

The  fact  that  extrema  of  wavelet  transforms  correspond  to  points  of  discon¬ 
tinuity  of  /(f)  has  been  used  in  [38]  to  register  signals  It  is  also  shown  exper¬ 
imentally  in  that  reference  that  certain  types  of  signals  can  be  reconstructed 
from  the  extrema  of  their  wavelet  transforms.  We  will  discuss  this  point  further 
in  the  next  section. 

2.C  Sampling  the  Wavelet  Transform 

As  mentioned  above,  a  valid  wavelet  transform  F{s,  u)  is  left  invariant  by  the 
application  of  the  kernel  K(s,s'.u,u').  This  is  similar  to  bandlimited  signals 
which  are  also  left  invariant  by  ideal  low  pass  filtering 

The  redundancy  present  in  a  bandlimited  signal  can  be  exploited  to  obtain 
more  efficient  representations  of  the  signal  [72].  For  example,  a  familiar  result 
is  the  a  bandlimited  signal  is  uniquely  determined  by  its  samples  taken  on 
a  suitable  grid.  Similarly,  it  is  by  now  well  known  that  almost  all  bandpass 
signals  of  bandwidth  less  than  one  octave  can  also  be  constructed  from  their 
zero-crossings  [35]. 

It  is  natural  to  ask  whether  one  can  exploit  the  redundancy  in  a  wavelet 
transform  in  order  to  derive  more  efficient  transforms.  The  answer  to  this 
question  turns  out  to  be  yes.  We  will  discuss  one  such  possibility  in  this  section 
based  on  tiie  extrema  of  the  wavelet  transform.  In  the  next  section,  we  discuss 
another  non-redundant  representation  that  is  based  on  a  judicious  sampling  of 
the  continuous  wavelet  transform. 

2.C.1  Reconstruction  of  a  Wavelet  Transforms  from  Its  Extrema 

As  mentioned  above,  Logan  [35]  has  shown  that  any  signal  f(t)  that  does 
not  share  any  zero  crossings  with  its  Hilbert  transform  can  be  reconstructed 
from  its  zero  crossings.  Now  recall  that  the  wavelet  transform  of  a  signal  f{i) 
at  a  given  scale  fixed  scale  s  may  be  interpreted  as  the  output  of  a  filter  of 
impulse  response  v(  —  si)  driven  by  f{t).  It  then  follows  from  Logan's  results 
that  if  ^(-')  is  chosen  to  be  non-zero  only  over  the  interval  n  <  ju,]  <  27r  for 
example.  F{s,u)  can  be  reconstructed  at  each  scale  s  from  its  zero  crossings 
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CO 


Figure  2:  Example  of  a  continuous  wavelet  decomposition.  The  original  signal 
is  shown  in  Fig.  2. a.  The  wavelet  transform  at  various  scales  is  shown  in  Figs. 
2.b-2.f.  Fig  2.b  corresponds  to  the  finest  scale  and  2.f  to  the  coarsest  scale 
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Unfortunately  the  above  characterization  is  unstable  in  that  a  small  error 
in  the  actual  location  of  the  zero  crossings  may  lead  to  large  errors  in  the 
reconstructed  function  F(s,u).  Furthermore,  it  is  difficult  to  relate  the  location 
of  the  zero  crossings  of  F(s,  u)  to  the  local  behavior  of  /(()  when  is  chosen 
as  above. 

The  latter  difficulty  may  be  circumvented  by  picking  wavelets  that  have  a 
large  ntimber  of  vanishing  moments.  It  is  then  possible  to  relate  the  rate  of  de¬ 
cay  of  F{s,u)  to  the  local  behavior  of  /(<)  (and  in  particular  its  local  regularity 
as  captured  by  its  local  Lipschitz  exponent  [28]).  To  stabilize  the  reconstruction 
several  researchers  in  computer  vision  have  suggested  using  additional  informa¬ 
tion  such  as  the  gradie.it  of  the  wavelet  transform  [27]  or  the  value  of  the  integral 
of  the  wavelet  transform  between  two  consecutive  crossings  [38]. 

Motivated  by  the  fact  that  in  practice  only  samples  of  the  signal  /{<)  are 
available  for  processing,  Mallat  [38]  has  also  formulated  a  discrete  signal  re¬ 
construction  problem  in  which  the  goal  is  to  reconstruct  an  N  =  2"^  point 
discrete  time  sequence  from  the  extrema  of  its  wavelet  transform  at  scales  2-’ . 
j  =  0, 1,  ■  •  • ,  J  —  1  and  its  approximation  at  scale  1.  (The  given  partial  wavelet 
information  is  not  necessarily  complete,  i.e.  it  need  not  correspond  to  a  unique 
signal.)  An  iterative  algorithm  is  given  in  that  same  reference  for  solving  this 
problem.  The  algorithm  performs  a  sequence  of  alternating  projections  onto  the 
set  of  valid  wavelet  transforms  and  that  of  2-D  functions  that  have  the  given 
extrema.  It  was  shown  to  converge  experimentally  but  no  theoretical  proof  of 
its  convergence  was  given. 

A  second  iterative  algorithm  for  solving  this  problem  was  presented  in  [11]. 
This  algorithm  differs  from  that  of  [38]  in  its  choice  of  the  sets  onto  which 
projections  are  performed.  In  particular,  the  sets  it  uses  are  closed  and  convex 
a  property  that  guarantees  the  convergence  of  the  algorithm  We  will  discuss 
this  problem  further  in  Section  3.C  where  we  describe  our  research  in  the  area 
of  sampling  dyadic  wavelet  transforms. 

The  practical  importance  of  the  results  described  above  is  that  they  pave 
the  way  for  the  development  of  novel  techniques  for  separating  mixed  signals 
Specifically,  it  may  be  possible  to  separate  two  signals  from  an  observation  of 
their  sum  as  long  as  the  two  signals  have  different  local  Lipschitz  behavior. 
This  may  be  done  by  computing  a  wavelet  transform  of  their  sum  and  then 
attempting  to  estimate  the  locations  and  magnitudes  of  the  extrema  of  the 
wavelet  transforms  of  the  signals  using  the  additional  a  priori  knowledge  about 
their  local  Lipschitz  properties.  Several  studies  that  are  based  on  variations  of 
this  idea  are  currently  being  pursued  by  different  research  groups 

2.D  Discrete  Orthogonal  Wavelet  Orthogonal  Transform 

A  second  approach  to  eliminate  the  redundancy  in  the  continuous  wavelet 
transform  consists  in  sampling  that  transform.  In  [17],  Daubechies  proposed  to 
sample  the  scale  parameter  s  on  a  grid  According  to  the  discus.sion 
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in  Section  3.B,  the  sampled  transform  F(a^,u)  is  the  output  of  a  bandpass 
filter  centered  at  o^w  and  with  a  root  mean  square  bandwidth  of  a^a^.  Hence, 
we  should  be  able  to  reconstruct  f(o^,u)  viewed  as  a  function  of  u  from  its 
samples  taken  on  the  grid  {n/3/a^}^_oo  “  ss  0  is  chosen  appropriately. 
Daubechies  studied  the  problem  of  choosing  a  and  0  to  obtain  a  nomredundant 
and  complete  representation.  She  showed  that  certain  choices  of  lead  to 
a  non-redundant  and  yet  complete  representation. 

A  valid  choice  for  the  pair  {a,0)  that  has  received  considerable  attention 
is  (q,/?)  =  (2,1).  In  particular,  it  b  shown  in  [17]  that  any  square  integrable 
signal  admits  the  decomposition 


m 


bU\m) 


]  "» 


-  m)dt. 


(2.D.1) 

{2-D.2) 


where  is  called  a  discrete  orthogonal  wavelet.  The  term  “discrete”  refers 
to  the  fact  that  (2.D.1)  is  a  sampled  version  of  the  continuous  wavelet  trans¬ 
form.  The  term  “orthogonal”  refers  to  the  fact  that  is  constructed  to  be 
orthogonal  to  ah  translates  of  its  dilates  Orthogonality  of  the  translates 

and  dilates  of  ri/(t)  is  not  necessary  but  may  be  achieved  by  properly  construct¬ 
ing  t/’(<)  The  main  characteristic  of  the  discrete  orthogonal  wavelet  transform 
is  that  it  is  non-redundant,  i.e.,  any  sequence  {\/2^6(j;m)}  that  is  absolutely 
summable  is  a  valid  wavelet  coefficient  sequence. 


2.E  Construction  of  Discrete  Orthogonal  Wavelets 

The  wavelet  is  not  unique,  but  it  is  also  not  arbitrary.  It  must  satisfy 
certain  conditions  that  insure  that  the  expansion  in  (2.D.1)  holds  for  any  square 
integrable  function.  The  wavelet  function,  and  corresponding  scaling  function, 
can  be  constructed  to  be  compactly  supported  [17].  In  the  sequel,  we  assume 
that  the  wavelet  is  compactly  supported. 

The  construction  of  the  wavelet  is  based  on  the  solution  «>(<)  of  a  two  scale 
difference  equation  (a  dilation  equation) 

m  =  Y.^k<i>(2t-k)  (2.E.3) 

k 

where  <^(<)  is  normalized  so  that 

=  l  (2.E.4) 

and  =  0  outside  th^  interval  [0,  K  —  1].  This  normalization  implies  that 

K-l 

51  =  2.  (2.E.5) 

k^O 
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Denote  by  $(w)  the  Fourier  transforms  of  By  taking  the  Fourier  transform 
of  both  sides  of  the  dilation  equation  (2.E.3)  we  obtain 

4(2u;)  =  G{u)*{u)  (2.E.6) 

where 

(2.E.7) 

Equation  (2.E.6)  then  implies  that 

=  (2.E.8) 

(Incidentally,  it  is  interesting  to  note  also  that  the  values  of  4>{t)  for  all 
integers  in  [0,  K  -  1]  may  be  computed  by  solving  a  simple  eigenvalue  problem. 
Specifically,  it  may  be  shown  by  using  (2.E.3)  that  the  vector  that  consists  of 
samples  of  at  all  integers  in  (0,A'  -  1]  is  the  eigenvector  of  the  matrix 
G  =  [eji-t]  corresponding  to  its  unique  largest  eigenvalue  which  is  equal  to  one 
[55].  The  recursion 

then  determines  0(<)  at  all  dyadic  points  n/2^ .) 

The  orthogonal  wavelet  is  constructed  from  ^(0  as 

^■(t)  =  ^dt<^(2f-i-)  (2.E.9; 


where 


d*  =  (-l)*c,_t.  (2.E.10) 

To  insure  that  the  dilates  and  translates  of  the  tJ'(t)  are  orthogonal  we  require 
that 


K-\ 

^  CkCt-7m  =  26om  (2.E.11) 

where  6om  is  a  Kroneker  delta  function.  This  condition  also  insures  that  4>(2H  — 
m)  is  orthogonal  to  ip{2H-  n),j  <  m  as  long  as  G(w)  has  no  zero  in  the  interval 
[-pt73,7r/3]. 

Another  interesting  property  of  the  wavelet  decomposition  that  is  a  direct 
result  of  (2.E.9)-(2.E.11),  is  that  (2.D.1)  provides  a  multiresolution  decomposi¬ 
tion  of  f(1).  Specifically,  it  is  a  simple  matter  to  show  that  as  in  the  continuous 
wavelet  transform  case,  the  coefficients  carry  information  about  f(t) 
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near  the  frequency  Vu  and  the  time  instant  For  fixed  J ,  the  partial 

sum  —  m)  provides  an  “approximation”  to  /(<)  up 

to  scale  2*^.  The  approximation  is  essentially  a  low  pass  filtered  version  of  /(<) 
approximately  bandlimited  to  2’^».  Furthermore,  the  approximation  to  /(<)  up 
to  scale  2"'  can  be  written  in  terms  of  translates  of  ^(2‘'t),  i.e. 

jfs— 60  m 

The  difference  between  the  approximations  at  scales  2^  and  2'^‘*''  is  the  “detail" 
of  the  function  at  scale  2^  and  is  given  by  the  sum  Ylm  ^(7;  -  m).  In 

particular,  we  may  rewrite  (2.D.1)  as 


=  (2.E.12) 

m 

=  [  f{t)4>(2^t~m)di.{2.E.n) 

J^oo 


PC  _  OC  00 

m=  Y.  a(J ■,m)<t>{2U  -  -")+Z  E  '/Vh(j,m)t'{2H  -  77i). 

m=-oc  m=-oc 

(2.E.14) 


An  example  of  a  wavelet  which  satisfies  the  condition  (2.E.5)  and  {2.E.11) 
is  given  by  the  Haar  wavelet.  The  Haar  wavelet  corresponds  to  the  scaling 
function  that  satisfies 

/  1  0<t< 1 

^  \  0  otherwise 

The  wavelet  itself  is  then  equal  to 

f  1  0  <  t  <  1/2 

xl){t)  =  <  -1  l/2<  f  <  1  . 

I  0  otherwise 


Multiscale  analysis  using  Haar  wavelet  exhibits  good  time  localization  but 
frequency  localization  is  very  poor.  In  other  words,  in  the  decomposition  (2.D.1) 
the  coefficients  6(/;m)’s  decay  very  slowly  to  zero.  Hence  the  conditions  (2.E.5) 
and  (2. E  li)  are  not  enough  to  construct  useful  wavelets  because  they  may  lead 
to  wavelet  decompositions  with  not  enough  regularity.  When  /(<)  is  a  generic 
smooth  function  and  V’(<)  has  a  finite  support,  then  as  n  tends  to  infinity  the 
integral  f  f(i)t/.’(ni)  di  is  0(n"P~*)  if  and  only  if  the  first  p  moments  of  r/(<)  are 
zero,  i.e. 


=  0 


m  =  0,  l,  -,p  —  1. 


(2.E.15) 
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It  may  be  shown  that  imposing  the  additional  requirement  for  to  have  p 
vanishing  moments  is  equivalent  to 

K-l 

53 (-I)***"*;*  =  0  m  =  0,l,  --,p- 1.  (2.E.16) 

isO 

This  requirement  also  implies  that  for  any  smooth  function  /(t),  4>{i)  and  its 
translates  approximate  the  function  with  accuracy  2“^,  i.e. 

-  it)||  <  for  suitable  a*  (2.E.17) 

k 

where  a  is  independent  of  /(<).  Note  that  this  means  that  any  polynomial  of 
order  less  than  p  can  be  represented  exactly  using  translates  of  <i>(t)  as  long 
as  (2.E.16)  holds.  This  in  turn  implies  that  the  multiresolution  representation 
given  by  (2.E.14)  can  actually  be  used  to  decompose  spaces  larger  than  L^{V.) 
In  particular,  any  function  of  polynomial  growth  up  to  order  p  -  1  may  be 
represented  as  in  (2.E.14)  even  though  such  a  function  is  not  in 

Condition  (3.12)  is  an  essential  property  of  wavelet  decompositions  as  we 
will  see  in  this  section  which  discusses  applications  of  wavelet  analysis  in  signal 
processing  Fig.  3  •  Fig  4  show  examples  of  scaling  functions  and  wavelet 
functions  with  various  number  of  vanishing  moments. 

2.F  Computing  Discrete  Orthogonal  Wavelet  Decomposi¬ 
tions 

The  wavelet  decomposition  given  by  (2.E.14)  can  be  computed  recursively  in 
scale  space  [37].  The  procedure  of  [37]  is  based  on  the  fact  that  the  coefficients 
{cic/\/2}  and  {dic/y/?}  may  be  viewed  as  the  impulse  responses  of  a  pair  of  finite 
impulse  response  conjugate  quadrature  mirror  filters.  Note  that  (2.E  3)  implies 

it 

=  (2.F.1) 

k 

It  then  follows  that 

a(;_l;m)  =  V^(/«).<>(2J-><-m))  =  ~^c*_2ma{j;i:).  (2.F.2) 

Similarly,  using  (2.E.4)  we  obtain 

=  (2.F.3) 
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The  multiscale  decomposition  (2.F.2)  -  (2.F.3)  can  be  implemented  using  filters 
whose  impulse  responses  are  g{n)  =  c_n/\/2  and  ij(n)  =  d-n/'s/?  followed  by 
decimators  as  in  Fig.  5.  In  Fig.  5  (j(w)  and  H{u)  are  the  discrete  Fourier 
transforms  of  g{n)  and  h{n)  respectively. 

Note  that  in  practice  one  is  given  samples  of  a  function  f{t)  at  a  given  scale 
2^ .  If  J  is  large,  those  samples  may  be  viewed  as  the  coefficients  a{J;m)  in  an 
approximation  a{J  \m)^{2^t  —  fn)  of  some  function  /(I)  at  scale  2^ .  Then  a 
wavelet  decomposition  of  /(<)  may  be  computed  recursively  as  shown  in  Fig.  6. 
Specifically,  the  samples  of  /(t)  are  recursively  low-pass  and  high-pass  filtered 
with  the  filters  ^(n)  and  h(n),  and  the  filter  outputs  are  decimated  such  that 
only  even-number  coordinates  are  kept. 

It  may  also  be  verified  that  the  sequence  {a(j;m)}  can  be  reconstructed  from 
the  approximation  and  detail  sequences  at  the  next  coarser  scale  {a(j  —  l;m}} 
and  {6(j  —  l;r7j)}as 

1;^)  +  -  !;<;)•  (2.F.4) 

The  reconstruction  (2.F.4)  can  be  implemented  by  upsampling  and  filtering  with 
O’fif’}  apH  s.<!  in  Fig  7. 


2.G  Finite  Data  Length  Discrete  Orthogonal  Wavelet  Trans¬ 
form 


With  a  finite  set  of  data,  it  is  not  possible  to  compute  all  a(j;m)’s  and 
6(j;  m)’s  exactly,  particularly  those  that  correspond  to  scaled  and  shifted  wavelets 
or  scaling  functions  straddling  the  boundary  of  the  interval.  To  handle  this 
problem  one  conventionally  assumes  that  the  data  is  periodic  outside  the  ob¬ 
servation  interval.  Suppose  we  are  given  a  finite  number,  N  =  2'^,  of  data 
{a{J;m)},rn  =  0.  •  ,2"^  —  1,  at  finest  scale  J.  Let  a-*  and  ^  denote  2^  x  1 
vectors  composed  of  2^  point  periodic  approximation  sequence  and  detail  se¬ 
quence  at  scale  2^ .  Let  the  2^~*  x  2-'  matrices  Gj  and  Hj  denote  the  matrix 
representation  of  filtering  the  periodized  data  using  g{n)  and  h{n)  respectively. 
Specifically,  Gj  and  Hj  are  given  by  [Gj]ii  =  a_2t/\/2  and  =  dk.iJ^/2 

Note  that  since  sequences  {c*}  and  {d*}  of  length  2p  are  required  to  produce  a 
compactly  supported  wavelet  with  p  vanishing  moments  [17],  we  are  restricted 
to  use  a  wavelet  with  p  number  of  vanishing  moments  such  that  2p  <  2^  at  scale 
2> ,  i.e.  we  have  to  use  wavelets  with  a  smaller  number  of  vanishing  moments  at 
coarser  scales. 

It  may  be  shown  that  Gj  and  Hj  have  the  following  properties: 


GjGj  =  HjHj  = 


GjHi* 


=  HjGj  =  0 
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{2.G,1) 

{2.G.2) 


Figure  5:  Analysis  filter 


Figure  6:  Discrete  wavelet  decomposition 
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Figure  7:  Synthesis  filter  pair 
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gTgj  +  hThj  =  i 

(2.G,3) 

With  these  filter  matrices  Gj  and  Hj,  the  finite  data  multiscale  decomposition 
may  be  expressed  as 

0^"*  =  Gj_io^ 

{2.G.4} 

(2.G.5) 

=  gTjoJ-1  +hTj^-’ 

(2.G.6) 

where  o'  and  V  are  2’  x  1  vectors  given  by 

flJ  =  (a(j;0).a(j;l),-  -.a(j;2-’ -  1)]^ 
V  =  [6{j.0).6(i;l).....6{j;2^  -  1)]^. 


From  (2  G,4)-(2.G.6)  we  have 

=  gJ  jGJ 

=  gJ_jgJ_2  gJo®  +  gJ_  j  Gj'Hja®  +  gJ_i  g^hJ^’ 

+  +  (2,G.7) 

Define 


and 


Q  = 


hJgJgT  gJ  ^ 

hJgTgT  .  gT^ 


6-  1 


{2.G.8) 
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(2.G.9) 


Then 


a-'  =  q'^b^  (2.G,10) 

and 

b^  =  Qo^.  (2.G.11) 

From  (2.G.l)-(2  G.3),  it  is  clear  that  Q  is  orthogonal,  i  e.,  QQ*^  =  Q^Q  =  I 
Equation  (2.G.11)  implies  that  wavelet  transform  can  be  computed  by  a 
multiplication  of  matrix  Q  and  a  data  vector  hence  we  will  call  the  matrix  Q 
the  discrete  wavelet  transform  (DWT)  matrix.  The  discrete  wavelet  transform, 
or  multiplying  a  vector  by  matrix  Q  is  more  efficiently  computed  in  the  following 
way 


i-'-i  ^  P 


,•'-1 


J-2 


b^  =  Qn-'  =  Po(Pi(P2(  -  (Pj.i£-')  )(2.G.12) 


where 


P 


J-1 


Gj.i 


.Pj-2 


rij-2  0 

0  Hj_2 

.  0  *^3-2 


<Pj-k 


Jj-k  0 

0  Hj.k 

0  ‘^J-k 


where  Im  is  an  2”’  x2”’  ident.ty  matrix  If  the  analyzing  wavelet  has  p  vanishing 
moments.  Hj  and  Gj  have  2p  number  of  nonzero  elements  per  row  Hence  the 
number  of  multiplications  for  computing  discrete  wavelet  transform  is  approxi¬ 
mately  4pN  where  A'  =  2'^.  Note  that  the  number  of  operations  for  computing 
DWT  is  proportional  to  A'  whereas  computation  of  Fr  T  requires  A’ log;  .V  op¬ 
erations. 


2.H  Generalization  of  Wavelet  Decompositions 

The  connection  between  wavelet  decompositions  and  2  band  perfect  recon¬ 
struction  filter  banks  was  briefly  discussed  in  the  above  subsection  Since  it  is 
possible  to  construct  M-band  perfect  reconstruction  filter  banks,  it  is  natural 
to  ask  whether  an  M-band  generalization  of  the  wavelet  decomposition  (2  D  1) 
exists.  It  turns  out  that  the  answer  to  this  question  is  affirmative.  In  partic¬ 
ular,  it  is  shown  in  [74]  (see  also  [69])  that  any  square  integrable  signal  /(f) 
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may  be  expanded  in  terms  of  a  scaling  function  <p{t}  and  A/  —  1  M-band  wavelet 
functions  as 

oo  M  — I  oo  oc  _ 

f{t)=  ^  \/Af-^a(J,m)0(2'^t-m)-H  Y'  ^ 

fn=-oo  n  =  lj=ym=-oc 

(2.H.1) 

The  scaling  function  ^(<)  obeys  a  two  scale  difference  equation  of  the  form 

4>ix)  =  '^ct4>(Mx~k)  (2.H,2) 

it 


while  the  wavelet  functions  tpnii)  are  given  by 

tP„{z)  =  ^d„,t<^{Mx-k).  n  =  1.2,  •  ,A/ -  1.  {2,H.3) 

k 


As  in  the  2*band  case,  the  A/  —  1  wavelets  v^’n(0  are  characterized  by  the  fact 
that  they  have  p  vanishing  moments,  with  p  >  1.  i.e. 


rr;„(t)dt  =  0 


m  =  0. 1,  •  .p  -  1. 


{2.H,4) 


Furthermore,  all  translates  and  dilates  of  the  A/  -  1  wavelets  Vnit)  are  mutually 
orthogonal.  Finally,  we  also  have 

M~1  J-l  oc  _  cc  ___ 

>/A/^6(j;rn)t’n(A/''<  —  m)  =  VM^a{J:m)o{2'^t  —  tv). 

n  =  l  j=-oc  m=-oc  m  =  -oc 

(2.H.5) 

i.e.  the  M-band  wavelet  decomposition  also  induces  a  multiresolution  decompo¬ 
sition  of  L'iH).  A  complete  characterization  of  the  coefficients  {ci}  and  {dn.t} 
that  lead  to  valid  wavelets  is  given  in  [74].  The  advantages  of  M-band  wavelet 
decompositions  over  the  2-band  case  are  the  additional  degrees  of  freedom  avail¬ 
able  to  the  designer  in  choosing  the  sets  {ci}  and  {dn.t}  and  the  more  compact 
signal  representations  that  generally  result  from  the  use  of  such  decompositions. 

avelet  decompositions  have  also  be  extended  to  the  M-dimensional  case. 
The  simplest  extension  is  mentioned  in  [37]  and  uses  scaling  functions  and 
wavelets  that  are  equal  to  simple  products  of  1-D  scaling  and  wavelet  functions 
More  flexible  non-  separable  wavelets  are  discussed  in  [13],  [33]  and  [67]. 
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3 


MAIN  RESULTS 


The  main  objective  of  this  research  was  to  explore  the  role  of  wavelet  the¬ 
ory  in  signal  and  information  representation  and  extraction.  More  specifically, 
during  the  period  February  1992  to  January  1993  we  studied  two  information 
extraction  problems;  separating  plane  waves  radiated  or  reflected  by  a  target 
from  a  background  noise  process  with  an  unknown  correlation  structure  and  es¬ 
timating  a  distributed  range-Doppler  target  reflectivity  map  to  within  an  error 
of  a  given  energy  using  a  minimal  number  of  radar  probing  waveforms. 

Our  goal  in  studying  the  first  problem  was  to  show  that  it  is  possible  to 
exploit  differences  in  the  structures  of  the  correlations  of  the  wavelet  transforms 
of  the  response  of  an  array  of  sensors  to  plane  waves  and  its  response  to  wide 
classes  of  correlated  background  noises.  We  succeeded  in  establishing  this  fact 
and  constructed  an  iterative  method  for  estimating  the  directions  of  arrival  of 
plane  waves  mixed  a  background  noise  process  with  an  unknown  correlation 
structure. 

In  the  second  problem,  our  aim  was  to  investigate  the  role  of  wavelets  in  radar 
imaging.  Wavelets  seemed  to  have  a  natural  role  in  radar  imaging.  A  wavelet 
representation  provides  a  decomposition  of  a  signal  in  terms  of  translates  and 
dilates  of  a  single  waveform.  On  the  other  hand,  the  reflection  process  in  radar 
imaging  induces  a  translation  and  dilation  of  the  transmitted  waveform.  Our 
work  showed  that  the  optimal  set  of  waveforms  that  must  be  trai.amitted  to 
obtain  the  best  estimate  of  any  range-Doppler  target  reflectivity  function  is  a 
set  of  N  wavelet  waveforms  that  are  derived  from  the  target  reflectivity  function 
itself. 

W'e  also  investigated  two  signal  representation  problems.  The  first  prob¬ 
lem  was  more  theoretical  in  nature  and  involved  constructing  a  simple  test  for 
the  completeness  of  an  arbitrarily  sampled  redundant  dyadic  wavelet  transform 
as  well  as  a  direct  computationally  efficient  method  for  reconstructing  a  signal 
from  a  complete  arbitrary  sampling  of  its  dyadic  wavelet  transform.  The  second 
problem  was  a  transparent  audio  coding  problem.  The  objectives  there  were  to 
establish  that  the  regularity  of  wavelets  is  important  in  coding  applications  and 
that  dynamic  adaptation  of  wavelets  to  the  underlying  signal  can  lead  to  sub¬ 
stantial  reductions  in  bit  rate  Another  goal  was  to  study  the  general  problem 
of  adaptively  choosing  an  analysis  discrete  orthogonal  wavelet  to  minimize  a 
cost  function  (e.g.,  bit  rate  in  this  application)  subject  to  satisfying  a  given 
constraint  (e  g.,  quality  of  the  encoded  signal  in  this  application). 

In  what  follows  we  briefly  summarize  the  results  that  we  have  obtained  in 
the  period  February  1992-January  1993.  A  more  complete  description  of  these 
results  can  be  found  in  several  papers  that  will  appear  or  have  been  submitted 
for  publication  in  the  IEEE  Transactions  on  Signal  Processing  and  the  IEEE 
Transactions  on  Image  Processing  [53],  [58],  [59],  [76].  Re-prints  or  pre-  prints 
of  these  papers  are  available  from  the  principal  investigator  and  from  AFOSR 
Summaries  of  these  papers  have  also  been  published  in  the  proceedings  of  two 
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conferences  and  a  workshop  [52],  [57],  [62],  [75]. 

We  continue  to  perform  research  in  the  aboveareas  and  on  two  related  prob¬ 
lems  (adaptive  beamforming  and  integrated  target  sensing  and  recognition)  with 
AFOSR  support. 
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3.A  Wavelet  Domain  Array  Processing 

Most  modern  high  resolution  bearing  estimation  techniques  assume  that  the 
background  noise  process  has  a  known  covariance  function.  These  techniques 
tend  to  perform  poorly  in  situations  where  the  covariance  structure  of  the  noise 
is  unknown.  To  address  these  situations,  several  approaches  have  been  proposed 
in  recent  years,  e.g.,  [34],  [47],  [45],  [56],  [70].Some  of  these  techniques  assume  a 
particular  model  for  the  noise  process,  e.g.,  an  auto-regressive-moving-average 
(ARM  A)  model.  Others  eliminate  the  effect  of  the  unknown  noise  covariance  by 
assuming  a  least  informative  prior  distribution  for  the  noise  covariance  function. 
In  our  work  we  derived  an  alternative  approach.  This  approach  requires  only 
that  the  spatial  correlation  function  of  the  noise  process  decays  asymptotically 
to  zero  as  the  distance  between  samples  of  the  noise  tends  to  infinity. 

For  simplicity  of  exposition,  we  will  assume  here  that  the  noise  process  and 
a  set  of  narrow  band  plane  waves  are  sampled  by  a  uniform  linear  array.  Note 
however,  that  the  ideas  described  here  can  be  applied  as  well  to  non-linear  and 
non-uniform  arrays  as  well  as  to  wide-band  signals. 

We  shall  denote  by  /(n)  the  output  of  the  sensor  in  the  array.  Recall 
that  in  an  M-band  wavelet  decomposition  a  signal  x(n)  is  decomposed  using  a 
regular  cascade  of  perfect  reconstruction  filter  banks  (c.f  Fig.  8).  Each  bank 
consists  of  Af  fillers  ift(r),  i  =  0, 1,  •  •  .  Af  -  1.  We  will  assume  here  that  the 
filters  k  =  I,  -  ■  ■  ,M  -  I  have  p  zeros  at  ;  =  1.  (Recall  that  each  of  these 

filters  must  have  at  least  one  zero  at  c  =  1.)  Note  that  by  using  the  “Noble 
identities"  of  Section  2.B  in  [64]  it  follows  that  Fig.  8. a  may  be  redrawn  as  in 
Fig.  8.b.  Following  wavelet  terminology,  we  shall  call  each  of  the  signals  rt(n) 
in  Fig.  8.b  the  “multiresolution”  component  or  the  “detail"  of  x(n)  in  band 
k.  We  also  introduce  a  partial  ordering  of  the  outputs  z'*{n)  by  numbering  the 
analysis  stages  from  left  to  right.  W'e  use  the  notation  p(k)  to  denote  the  order 
of  the  analysis  stage  in  which  Xk(n)  is  computed.  We  shall  then  say  that  r*(n) 
is  a  “higher  resolution”  component  of  i(n)  than  rt<(n)  if  p(k)  <  p(k'). 

Since  the  filters  Hk(z),  it  =  I,  •  •  • ,  Af  -  1,  in  Fig.  8  have  at  least  p  zeros  at 
2=1,  the  filters  Wt(z)  that  produce  the  details  of  f(n)  will  also  have  at  least 
p  zeros  at  I  =  1.  Hence,  if  we  use  u'i(n)  to  denote  the  impulse  response  of  the 
filters  W  t(;),  we  note  that 

^m"u;i(m)=:0  n  =  0,l,-  ,p-l  (3.A.1) 

m 

We  shall  see  in  the  next  section  that  (3.A.1)  will  play  a  key  role  in  obtaining 
sparse  representations  for  a  large  class  of  covariance  matrices. 
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(b) 

Figure  8:  A  Two  Stage  M-band  Decomposition 
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3.A.1  Asymptotic  Form  of  the  Correlation  Function  of  the  Wavelet 
Transform  Coefficients  of  the  Outputs  of  a  Linear  Uniform 
Array 


First,  observe  that  the  covariance  R(m,n)  of  /(n)  is  a  sampled  version  of 
a  continuous  2-D  function  R{t,s).  Denote  by  ii{;*(m,n)  =  E[fj{m)fk{n)]  the 
cross-  covariance  between  the  component  of  /(n)  in  the  band  and  the 

component  of  /(n)  in  the  band.  Observe  that  using  the  notation  that 

we  introduced  above  we  may  express  /2j;*(m,n)  as 


m'iS,  n'iSk 


—  m\  - 

(3.A.2) 


In  (3.A.2)  $k  is  the  support  of  the  filter  u;t(n),  i.e.,  the  set  of  indices  n  for  which 
|u)jfc(n)|  ^  0.  Our  objective  is  to  study  the  asymptotic  behavior  of  Rl;’‘[m.n) 
as  max(m,n)  tends  to  infinity  while  min(m,n)  is  finite  and  fixed  Without 
loss  of  generality  we  will  assume  here  that  n  =  min(Tn,n)  and  is  fixed  while 
m  =  max(m,n). 

Assume  now  that  all  the  partial  derivatives  R(t,s)/dt^ds'  of  R(t.s) 
exist  for  0  <  /,ib  <  2p.  Following  a  technique  introduced  in  [42]  to  study 
wavelet  decompositions  of  operators  and  also  used  in  [6],  we  first  use  Taylor's 
theorem  of  the  mean  for  smooth  2-D  functions  to  write 


2p-l 


(2p)!  L 


t  =  l 


1  s 
»  = 


2P 


R{t,s} 


,  =  Ari>)m-em'  for  some  0  <<?<  1 

j  =  Af'f'in  -  tn' 


(  = 

s  =  AfC'^'n 


+ 


1 

(2p)! 


(m')'(n')*'’-' 


d-f’R{t.s) 

dt’ds^P-‘ 


t  =  -  #m' 

<  =  -  <n' 


for  some  C  <  ^  <  1  • 


(3.A.3) 


If  we  substitute  this  expression  into  (3. A. 2)  and  use  the  fact  that  the  first  p 
moment.s  of  Uj  (ni)  and  u'i(n)  are  zero  we  obtain 


1 

(2p)! 


E(t)  E  E 

1=0  '  '  m'65,  n’€5» 
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< 


sup 

(fn*,fi*)  €  Sj  X  Sk 

^  €  [0.  M 


d^PR(t,s) 

dt‘d8^r>-‘ 


1  =  -  0m' 

«  =  -#r»' 


{(m')'(n')^^~'u/y(m')wt(n')| 


C 


sup 


(m',n')  € 

•6{0.ll 
I  €  {0. 1,  2p) 


d^f/i(i,s) 

dt'ds^P-' 


t  =  -«m' 

,  =  M'<‘>n  -  #n'. 


(3.A4) 


In  the  above  equation  C  is  a  finite  constant  given  by 

^=7^Ef  ?  )  E  IK)SK)|  Z  |(n')*'’-'«^*(n')l.  (3.A.5) 

/=0  '•  ^  m'€S,  n'eSk 

Hence,  as  long  as  ali  terms  I  |,  0  <  /  <  2p,  decay  faster  to  zero  than 

|iZ(m,  n)|  itself  as  m  tends  to  infinity  (i.e.  as  long  as  limm-oo(||j^T7=l  |)/|^(»ti.  ")t  = 
O.Vf  <  2p),  Ri;^(m,n)  will  decay  faster  than  R(m,n)  as  m  lends  to  infinity.  In 
particular,  the  multiresolution  components  of  *(n)  will  effectively  be  partially 
uncorrelated. 

Consider  now  the  case  where  R(m,n)  behaves  asymptotically  as 

yZ(m,  n)  =  +  o(l))  +  ai(n)  as  m -*  oo.  (3. A. 6) 


In  the  above  equation  the  function  ^(n)  may  have  both  a  real  and  an  imaginary- 
part.  However,  its  real  part  is  assumed  to  be  a  non-negative  and  non-decreasing 
function  of  n.  Its  imaginary  part  must  be  zero  if  its  real  part  is  zero.  The 
function  o(n)  is  only  assumed  to  be  a  non-increasing  function  of  n  The  class 
of  processes  with  covariance  functions  that  behave  asymptotically  as  (3. A. 6)  is 
quite  large.  For  example  it  includes  all  stationary  processes  that  can  be  obtained 
by  passing  white  noise  through  a  rational  filter.  The  covariance  functions  of  such 
processes  decay  exponentially  fast  to  zero  for  fixed  n  as  m  tends  to  infinity.  It 
also  includes  nonstationary  process  such  as  a  sampled  version  of  a  fractional 
Brownian  motion  [41]-  [2]  whose  covariance  function  is  given  by 

R(m,  n)  =  -f  |n|^^  -  [m  -  np"),  0  <  W  <  1. 

For  fixed  n,  this  covariance  function  behaves  asymptotically  as  (3.A.6)  with 
^(n)  =  0.  The  fractionally  differenced  white  noise  process  [26]-[20]  and  samples 
of  the  underwater  background  acoustic  noise  in  shallow  water  taken  along  any 
straight  line  [8]  are  other  examples  of  nonstationary  processes  with  covariance 
functions  that  behave  asymptotically  as  (3. A. 6). 

As  m  tends  to  infinity  and  since  the  sums  in  (3.A.2)  are  finite,  we  may 
rewrite  (3.A.2)  as 

n'e5»  m'€5. 
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(ao(M^^>n  -  n')  +  o(l))  +  ai{M^^^n  -  n')]} 

{3.A.7) 

where  we  have  used  (3. A. 6).  Recalling  that  3?(/?( ))  >  0,  define  the  quantities 
ao(A/'’^*^n)  and  as 

/JoCAf-^^^n)  =  -  T»o)  (3.A.8) 

no  €  R^  =  {n'€5t:S(^(A/'’<*>n-n')  =  Jnf^»(/?(Af'^*>n-n')} 

(3.A.9) 

ao(A/^^‘^n)  =  sup  -  n').  (3.A.10) 

n'€R# 


When  3f(/?o(M'’^*^n)  ^  0  it  may  be  readily  verified  that 


_ |/?{;*(m,n)| _ 

where 


Co(A/^'*>n), 

(3.A.11) 


co{M^^*^n) 


Ro 


n'gHo  m'iS) 

(3.A.12) 

=  W  e  K/5  :  -  n’}  =  sup  o(M'’<*)n  -  n')}.(3.A.13) 


This  of  course  implies  that  lRi;*(m,n)|  decays  asymptotically  as 
g-»(So(M'’<*)n)Af<-<^)m)  ^nd  q(.)  are  non-  decreasing  and  non-increasing 

functions  of  their  arguments  respectively,  this  also  means  that  |^i;*(m,n)|  de¬ 
cays  asymptotically  to  zero  at  a  rate  much  faster  than  /?(m,n). 

On  the  other  hand,  when  0(n)  =  0  identically  in  (3.A.6),  it  may  be  verified 
using  (3.A.1)  that 


I'  _ |f^-  (ni,n)j _ _  t 

A/Jim-oo  ~  ^ 

where 

ciro(Af'’^*^'*)  =  sup  o(M'’^‘^n  -  n') 
n'€5* 


(3.A.14) 


(3.A,15) 


n'€5*  m'SSj 


(3.A.16) 
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Note  that  in  this  case  as  m  tends  to  infinity,  R(m,  n)  may  either  decrease 
hyperbolically  (instead  of  exponentially)  fast  to  a  constant  which  may  be  a 
function  of  n.  It  may  also  increase  without  bound.  However,  according  to 
(3. A. 14)  n)  decreases  to  zero  asymptotically  as  as  long  as 

®o(  )  <  P-  In  particular,  /?{;*(m,n)  decreases  once  more  to  zero  asymptotically 
at  a  rate  much  faster  than  R{m,n). 

Finally,  observe  that  if  /?(m,n)  consists  of  oscillatory  components,  both  a(n) 
and  0{n)  in  (3.A.6)  will  be  zero.  In  this  case,  the  wavelet  transform  does  not 
decorrelate  the  process. 

The  observations  that  we  have  made  here  are  illustrated  in  Figs.  9. a  and  9.b. 
Fig.  9.a  shows  the  magnitude  of  the  entries  of  the  correlation  matrices  of  the  &- 
band  wavelet  coefficients  corresponding  to  five  uncorrelated  sources  of  powers  - 
17  dB,  -2.8  dB,  -23  dB,  -6.38  dB  and  -6.38  dB  (relative  to  the  total  noise  power) 
impinging  respectively  at  45®, 70*, 75°, 85®  and  90®  on  a  64  elements  uniform 
linear  array  with  inter-sensor  separation  of  A/2.  On  the  other  hand  Fig.  9.b 
shows  the  magnitude  of  the  entries  of  the  correlation  matrices  of  the  8-band 
wavelet  coefficients  of  a  2  pole  AR  background  noise  process.  The  poles  of 
the  process  were  at  0.96e"'**  and  0.7e’’^^.  Their  respective  powers  were 
-27  dB  and  -0.0087  dB  relative  to  the  total  noise  power.  This  noise  process  was 
constructed  by  Dr.  N.  Owsley  from  NUSC.  It  is  a  good  model  for  the  background 
noise  observed  in  the  ocean  in  a  particular  set  of  experiments.  Observe  that  the 
correlation  matrix  of  the  wavelet  transform  of  the  signal  part  consists  of  plateaus 
while  that  of  the  noise  consists  of  line  structures  . 

3.A.2  Wavelet  Domain  Bearing  Estimation  in  the  Presence  of  Cor¬ 
related  Noise  of  Unknown  Structure 

The  bearing  estimation  approach  that  we  proposed  is  based  on  the  results 
that  we  described  in  Section  l.A.l.  It  consists  in  computing  an  estimate  of 
the  correlation  matrix  of  the  M-band  wavelet  coefficients  corresponding  to  the 
noiseless  sensor  outputs.  The  procedure  begins  by  estimating  the  correlation 
matrix  of  the  M-band  wavelet  coefficients  corresponding  to  the  noisy  sensor 
outputs  using  an  average  of  the  outer  products  of  vectors  of  the  ^I-band  wavelet 
coefficients  corresponding  to  several  snapshots  of  the  sensor  outputs.  Next,  we 
zero  the  entries  of  the  estimated  correlation  matrix  that  lie  along  any  observed 
linear  structure.  These  linear  patterns  can  be  parts  of  diagonals  or  slanted 
diagonals.  They  are  easily  detected  using  image  processing  techniques.  The 
above  step  essentially  cancels  the  contribution  of  the  noise  correlation  matrix 
to  the  computed  wavelet  domain  correlation  matrix. 

Denote  by  R,  the  autocorrelation  matrix  corresponding  to  the  wavelet  co¬ 
efficients  of  the  autocorrelation  of  the  plane  waves.  One  is  then  left  with  a 
perturbed  version  of  the  matrix  Rj  with  missing  entries.  The  perturbation  is 
a  function  of  the  number  of  snapshots  used,  the  signal-to-noise  ratio  (i.e.  the 
accuracy  with  which  the  correlation  matrix  of  the  M-band  wavelet  coefficients 
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corresponding  to  the  sensor  outputs  is  estimated)  and  the  degree  to  which  the 
M-band  wavelet  decomposition  decorrelates  non-time-synchronous  wavelet  co¬ 
efficients  of  the  noise  process.  By  computing  the  eigenvectors  corresponding 
to  a  set  of  smallest  eigenvalues  of  the  matrix  R,  with  nnissing  elements  and 
by  using  the  inverse  wavelet  transform  of  the  corresponding  eigenvectors  in  an 
eigenstructure  based  technique  (e.g.  MUSIC,  ROOT  MUSIC,  etc.  [1]),  one  is 
then  able  to  compute  a  preliminary  estimate  of  the  frequencies  of  the  line  spec¬ 
trum  or  the  directions  of  arrivals  of  the  plane  waves.  Note  that  at  this  stage  it 
may  not  be  possible  to  resolve  all  sources.  Using  those  estimates,  one  then  com¬ 
putes  estimates  of  the  entries  of  R,  which  had  been  zeroed.  These  estimates  are 
then  used  to  fill  the  zeroed  entries  of  the  estimated  wavelet  correlation  matrix 
and  the  procedure  is  repeated  on  the  resulting  matrix.  The  whole  process  is 
repeated  until  convergence.  A  flow  chart  of  the  procedure  is  given  in  Fig-  10. 

The  procedure  implicitly  solves  a  non-linear  least  squares  fitting  problem. 
It  can  be  shown  that  it  is  guaranteed  to  converge  to  a  local  minimum  It  is  ini¬ 
tialized  with  a  guess  obtained  using  any  of  the  traditional  approaches  together 
with  any  other  information  that  may  be  obtained  by  looking  at  the  structures  of 
the  plateaus  in  the  correlation  matrix  of  the  wavelet  domc'n  data.  Experience 
has  shown  that  such  an  initialization  leads  to  good  results.  The  procedure  was 
compared  to  more  traditional  approaches  in  a  number  of  scenarios  including  the 
one  described  at  the  end  of  Section  l.A.l.  In  that  scenario,  none  of  the  tradi¬ 
tional  approaches  was  able  to  detect  the  sources  at  45®  and  75®  even  when  exact 
correlation  matrices  are  used.  On  the  other  hand,  the  proposed  approach  was 
able  to  detect  all  sources  when  exact  covariances  were  usee  or  w’hen  more  than 
20  snapshots  were  assumed  to  be  available.  Fig.  11  shows  the  results  obtained 
in  another  scenario  using  a  simpler  version  of  the  proposed  procedure  in  which 
positions  of  the  entries  which  are  zeroed  is  prefixed  to  be  the  main  diagonal 
and  some  slanted  diagonals  regardless  of  where  the  linear  structures  appear. 
The  exact  source  location  are  marked  with  tics  on  the  plots.  Each  simulation 
(one  curve  on  the  attached  plot)  assumes  that  100  data  snapshots  are  available 
for  processing.  Note  that  MUSIC  detects  one  source  and  is  unable  to  resolve 
two  others.  The  simplified  version  of  our  procedure  is  able  to  resolve  these  two 
sources  but  cannot  detect  the  other  two  weak  sources.  The  full  technique  is 
needed  to  detect  these  latter  two  sources 
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Figure  10:  Flow  Chart  Description  Of  The  Proposed  Bearing  Estimation  Ap 


Isotropic  noise 

source  location:  [45  70  85  90  105} 
source  power:  [0.02  0.515  0.23  0.23  0.005]  dBs 
number  of  source  estimate  »  5 


Figure  11:  Performance  Of  MUSIC  And  A  Simplified  Version  Of  The  Proposed 
Technique  In  A  Given  Scenario. 
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3.B  Optimal  Radar  Range-Doppler  Imaging 

The  main  contribution  of  our  work  in  the  area  of  radar  imaging  was  to  show 
that  the  most  accurate  reconstruction  of  a  range- Doppler  target  density  that  can 
be  obtained  using  only  N  waveforms  and  their  echoes  results  from  transmitting 
the  singular  functions  corresponding  to  the  N  largest  singular  values  of  two  ker¬ 
nels  derived  from  the  target  density.  These  singular  functions  are  valid  wavelets 
that  obey  an  additional  orthogonality  constraint  in  the  frequency  domain  We 
have  used  this  result  to  develop  a  solution  to  the  problem  of  choosing  a  set  of 
N  waveforms  to  reconstruct  with  high  accuracy  an  arbitrary  unknown  target 
range- Doppler  density  function. 

Note  that  in  practice  the  important  question  is  how  to  construct  the  most 
accurate  approximation  to  the  range- Doppler  target  density  by  illuminating  the 
target  for  a  maximum  of  T  seconds.  This  problem  is  related  to  the  one  that 
is  addressed  here  since  the  A'  waveforms  will  have  a  total  duration  equal  to 
the  sum  of  their  individual  durations  plus  the  duration  of  all  silence  intervals 
separating  the  waveforms.  The  length  of  these  silence  intervals  is  determined 
from  a  coarse  a  priori  knowledge  about  the  support  of  D(i.y).  If  the  radar 
selects  the  transmitted  waveforms  from  a  fixed  library  based  on  the  approach 
presented  here,  the  individual  durations  would  be  known  a  priori.  Since  the 
theory  developed  in  our  work  also  identifies  the  relative  importance  of  each 
transmitted  waveform  in  the  reconstruction,  it  is  possible  to  use  T  to  select  an 
appropriate  subset  of  A’  individual  waveforms  for  optimal  imaging  of  the  target 
in  less  than  T  seconds. 

3.B.1  Problem  Formulation 

Consider  first  a  point  reflector  at  a  distance  r(<o)  from  a  monostatic  radar 
at  time  to.  Let  s(t)  be  the  waveform  transmitted  by  the  radar.  W’e  will  assume 
here  that  the  echo  received  by  the  radar  at  time  t  is  given  by 

As{t-T(l))  {3B,1) 

where  A  is  a  constant  determined  by  the  reflection  properties  of  the  point  target 
and  the  propagation  characteristics  of  the  medium  and  T{t)  is  the  total  delay 
incurred  by  the  part  of  the  waveform  that  arrive^  at  the  radar  at  time  t.  Note 
that 

r(<)= -r«-ir(<))  (3.B.2) 

c  2 

where  c  is  the  velocity  of  propagation  of  the  electromagnetic  waves  in  the 
medium 

It  is  cor.venierit  to  express  7-(i)  as  a  power  series  over  the  signal  duration 
In  particular,  if  the  change  in  the  velocity  of  the  reflector  over  the  illumination 
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time  (duration  of  s(<))  is  negligible  compared  to  c,  then  we  find  that  [32] 


T(i)  w  r  + 


2t^(x/2) 
c+  t'(z/2) 


(t-r) 


(3,B,3) 


where  i  is  an  arbitrary  reference  time  instant  and  v(z/2)  is  the  velocity  of  the 
point  reflector  at  time  z/2  along  the  line  of  sight  Note  that  it  follows  from 
(3  B.2)  that  the  range  of  the  target  at  time  t  ~  z/2  is  cz/2.  Hence,  the  received 
waveform  at  time  t  will  have  the  form  Af{yit  -  z))  where  y  =  (c-  ii(z/2))/(c  + 
v{x/2)).  This  is  the  exact  broadband  form  of  the  echo  rather  than  the  usual 
narrow  band  approximation  used  in  the  literature.  This  formulation  of  the  echo 
takes  care  of  range-migration  effects  automatically. 

Suppose  now  that  the  target  consists  of  a  continuum  of  point  reflectors  lo¬ 
cated  at  various  ranges,  moving  with  different  velocities  with  respect  to  the 
radar  and  having  different  reflectivities.  Assuming  that  reflection  and  propaga¬ 
tion  are  linear  and  that  all  parts  of  the  target  are  illuminated  equally,  the  echo 
e(<)  due  to  the  target  will  have  the  form 


dx  £>(z,y)s(y(t  -  z)). 


(3.B-4) 


In  the  above  equation  D{x,y)  is  the  reflectivity  of  a  point  target  located  at 
range  cz/2  and  moving  with  velocity  (1  -  y)c/{l  +  y)  at  time  z/2  Since  a 
negative  range  is  meaningless,  D{x.y}  =  0  for  all  z  <  0,  Our  goal  in  the  next 
section  will  be  to  reconstruct  the  best  approximation  to  D(z.y)  by  recording 
the  echoes  due  to  a  fixed  number  of  transmitted  waveforms. 


3. B.2  Approximation  of  a  known  a  Z?(z,y)  using  a  set  of  waveforms 
and  echoes 

First  observe  that  (3.B.4)  has  the  same  form  as  an  inverse  wavelet  transform 
Specifically,  if  v^  (?)  is  a  valid  wavelet  function  then  any  square  iiuegrable  signal 
f(t)  can  be  written  as 

/(<)  =  7!-  /  dy  f  dx  F{x,y)^'yx.iy{i  -  T))  (3.B  5) 

V'v  70  7_oc 

where  is  a  constant  that  depends  on  t  (t)  and  F(x.y)  is  the  continuous 
wavelet  transform  of  /(t)  with  respect  to  v(<)  This  observation  seems  to  sug¬ 
gest  then  that  one  can  reconstruct  D(z,y)  by  transmitting  a  single  valid  wavelet 
function  t,'’(t)  and  computing  the  wavelet  transform  of  the  echo.  Unfortunately, 
as  observed  in  [44]  and  [36],  following  such  an  approach  simply  yields  the  pro¬ 
jection  of  D{x,y]  onto  the  range  of  the  wavelet  transform  with  respect  to  v{i) 
Clearly  the  problem  that  we  are  addressing  involves  reconstructing  a  2-D 
function  from  1-D  observations.  Thus,  in  general,  reconstructing  an  arbitrary 
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D{x,y)  will  require  transmittirjg  and  recording  an  infinite  number  of  wave¬ 
forms  and  echoes.  Suppose  on  the  other  hand  that  we  are  restricted  to  us¬ 
ing  N  or  less  waveforms.  How  can  we  optimize  the  choice  of  these  waveforms 
to  get  the  most  accurate  approximation  of  D(x,y)  in  the  norm  |l£)(r,j/)|p  = 
dy  dz|£)(x,y)p?  We  will  answer  this  question  by  first  assuming  that 
D(x,  y)  is  actually  known.  The  answer  that  we  obtain  will  then  guide  our  choice 
of  transmitted  waveforms  for  the  actual  case  where  D(x,y)  is  unknown. 

By  taking  the  Fourier  transform  of  (3.B.4)  we  obtain 

foo  5/  it  ) 

£(0.)=  /  A{w.y)— i-.  (3.B.6) 

Jo  y 


In  the  above  equation  A(u;,y)  is  the  Fourier  transform  of  D(x,y)  with  respect 


to  X.  i.e. 


A(w,y) 


D(x,  y)e~'’“’*dr. 


(3.B.7) 


Define  E+(ui)  =  £(w)  for  w  >  0  and  £+(w)  =  0  otherwise.  Similarly,  let 
£_(w)  =  £(u;)  for  w  <  0  and  £_(w)  =  0  otherwise  Now  let  us  change  the 
variable  of  integration  by  defining  u  =  u/y.  With  this  change  of  variable  of 
integration  (3.B.6)  yields  the  following  two  integrals 


and 


£+(-/)=  /  r+(w.t/)5(u)—  (3.B.8) 

Jo  ^ 

£.(w)  =  y”  r_(w,u)S(u)^.  (3.B.9) 


In  the  above  equations  7V(u^,i()  =  r(w,u)  for  w  >  0  and  r+(u;,u)  =  0  other¬ 
wise.  r_(w,u)  =  rfw.ii)  for  -w  <  0  and  T+(~j,u}  =  0  otherwise  and  T(^.u)  = 
A(w,w/u).  Note  that  T±{-,  •)  are  kernels  of  two  maps  that  take  L^(R± .  du/|ul) 
into  i^(R±.cL  ).  If  we  assume  that  D(x.y)  has  finite  energy,  i.e.,  if 


dy|£>(x,y)p  <  oc 


(3.B.10) 


then  it  may  be  shown  that  the  operators  corresponding  to  T±{-.  )  are  compact 
operators.  Hence  these  operators  admit  a  singular  value  decomposition.  In  par¬ 
ticular,  it  follows  that  the  best  approximations  to  either  r+(w,u)  or 
using  M  functions  S±,(u),  n  =  1,M  and  their  corresponding  echoes  (images 
under  T±(  -  ))  **  obtained  by  choosing  the  functions  S±„(u),  n-  \.h1 

to  be  the  singular  functions  of  7'+(w,w)  or  r_(u;,u)  corresponding  to  their  M 
largest  singular  values.  Nov  that  since  the  singular  functions  S±,(u)  of  T±(  .  •) 
belong  to  i^(R±,du/|u|)  they  are  the  Fourjtr  transforms  of  valid  wavelet  func¬ 
tions.  Specifically,  we  have 


du  <  oc 


(3  B.ll) 
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which  is  the  admissibility  condition  for  wavelet  functions. 

Hence,  if  D{x,y)  is  known,  the  best  2  norm  approximation  to  it  that  uses  N 
waveforms  and  their  corresponding  echoes  can  be  computed  by  choosing  a  set 
of  N  waveforms  from  the  set  of  2N  singular  functions  of  T+(u),  u)  and  T-(uj,u) 
corresponding  to  their  N  largest  singular  values.  The  precise  choice  is  made  by 
computing  the  reduction  that  will  occur  in  the  norm  of  the  error  by  including 
a  particular  singular  function  of  T+(u,u)  versus  one  of  T_(w,u).  The  singular 
functions  are  considered  in  the  order  in  which  their  corresponding  singular  values 
appear  when  arranged  in  order  of  decreasing  magnitude. 

3.B.3  Reconstruction  of  an  unknown  D(x,y) 

The  results  of  the  last  section  provide  a  general  guideline  for  choosing  A' 
waveforms  for  optimal  reconstruction  of  a  range-Doppler  distribution  D{x,y). 
Specifically,  the  waveforms  should  be  close  approximations  to  the  singular  func¬ 
tions  of  T+(w,u)  and  T_(w,ti)  corresponding  to  their  N  largest  singular  values. 
The  problem  of  course  is  that  D{x,y)  is  unknown. 

It  turns  out  that  it  is  possible  to  find  a  set  of  functions  that  would  act  as  ap¬ 
proximate  singular  values  for  all  kernels  r+(w,u)  and  T_(u,’,u)  that  correspond 
to  finite  support  range-Dopple  densities  D(x,y).  In  particular,  if  the  Fourier 
transforms  S±,(u)  are  chosen  such  that  each  5±,(ln(u))  is  equal  to  a  particular 
translate  and  dilate  of  the  same  discrete  wavelet  function  with  a  large  number  of 
vanishing  moments  then  they  will  provide  a  good  approximation  to  the  singular 
functions  of  all  kernels  r+(u;,u)  and  r_(w,u)  that  correspond  to  finite  support 
range-Dopple  densities  D(x,y)  [58]. 

We  still  need  to  impose  an  ordering  on  these  approximations  to  the  singular 
functions  of  the  kernels  r+(w,u)  and  r_(w,u)  that  we  expect  to  encounter. 
Specifically,  we  are  really  interested  in  approximating  the  singular  functions 
of  T+{u<,u)  and  r_(w,u)  corresponding  to  their  largest  singular  values.  This 
can  again  be  done  using  the  fact  that  D[x,y)  has  a  finite  support  in  the  (x,y) 
plane.  The  finite  support  constraint  implies  that  A(u>.3/)  is  a  smooth  function 
in  w  (its  Fourier  transform  with  respect  to  w  is  a  “low  pass"  function)  and  has  a 
finite  support  in  the  y  variable.  This  fact  enables  us  to  predict  the  asymptotic 
behavior  of  the  wavelet  coefficients  in  a  2-D  discrete  wavelet  decomposition  of 
either  7+(w,u)  or  T’_(w,u)  [10]. 

More  generally,  if  we  can  collect  data  corresponding  to  several  representa¬ 
tive  D(x,y)  profiles  we  may  use  the  following  two  step  adaptive  range-Doppler 
imaging  scheme.  In  the  first  step  we  classify  the  available  D(x,y)  densities  into 
several  classes  using  a  clustering  algorithm  based  on  a  norm  criterion,  e  g  the 
2-  norm.  Next,  for  each  class  we  compute  a  set  of  N  waveforms  that  act  as 
approximations  to  the  singular  functions  corresponding  to  the  largest  singular 
values  of  the  kernels  T+{u),u)  and  in  this  class.  We  also  construct  a 

set  of  N  fixed  waveforms  to  be  used  in  a  pre-imaging  classification  step.  We 
will  refer  to  these  waveforms  as  the  classification  waveforms.  This  first  step  is 
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Figure  12:  Actual  D(x,y) 


done  off-line. 

The  second  step  is  performed  on  line  during  actual  imaging  of  a  target .  The 
radar  first  transmits  the  classification  waveforms.  A  vector  quantization  rou¬ 
tine  then  uses  the  approximate  D{x,y)  reconstructed  using  these  waveforms  to 
determine  the  class  of  range-Doppler  densities  to  which  the  observed  D{x,y) 
belongs.  The  radar  finally  transmits  the  appropriate  set  of  waveforms  corre¬ 
sponding  to  the  identified  class  to  obtain  a  higher  resolution  image  of  D{x,y). 
The  details  of  this  procedure  are  given  in  [58]. 

3.B.4  A  Simulation  Example 

Let  us  illustrate  the  above  technique  with  a  simple  example.  Assume  that  it 
is  desired  to  image  the  distribution  D(x,y)  shown  in  Fig.  12.  This  distribution 
consists  of  two  2-D  Gaussian  functions.  The  optimal  reconstructions  that  we 
can  obtain  by  sending  one  or  two  properly  chosen  wavelets  are  shown  in  Figs. 
13  and  14.  Note  that  Fig.  14  is  essentially  D{x,y).  This  is  the  case  because 
D{x,y)  is  actually  a  rank  2  kernel.  Hence,  it  can  be  reconstructed  exactly  using 
two  properly  chosen  waveforms. 


Figure  13:  Reconstructed  D{x,y)  Using  1  Wavelet. 


Figure  14:  Reconstructed  D{x,y)  Using  2  wavelets. 
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3.C  Signal  reconstruction  from  an  Arbitrary  Sampling  of 
Its  Dyadic  Wavelet  Transform 

Wavelet  transform  maxima  and  zero-crossings  signal  representations  have 
been  investigated  in  both  image  and  speech  processing  areas.  Specifically,  such 
representations  can  be  used  for  image  edge  detection  and  multiscale  image  edge 
representations  [40,  38,  39],  speech  pitch  detection  [30],  and  auditory  represen¬ 
tation  of  acoustic  signals  [71]. 

It  has  been  conjectured  that  such  a  signed  representation  is  complete  (or 
unique)  [11,  40,  38,  39].  However,  recently  in  [3,  4]  some  counter  exeunples  have 
been  given  to  show  the  non-completeness  of  such  representations.  In  particular, 
a  necessary  and  sufficient  condition  of  the  completeness  has  also  been  given  in 
[3,  4].  Unfortunately,  checking  this  condition  involves  determining  the  rank  of 
a  matrix  that  can  have  a  potentially  large  size.  Therefore,  this  condition  is  not 
practical. 

In  our  work,  we  gave  a  necessary  and  sufficient  condition  for  the  completeness 
of  the  discrete  dyadic  wavelet  transform  extrema  (or  zero-  crossing)  represen¬ 
tation  for  discrete  finite  data  length  signals.  These  signals  are  most  commonly 
used  in  practice.  We  showed  that  the  uniqueness  of  such  a  representation  de¬ 
pends  only  on  the  locations  of  the  wavelet  transform  extrema  (local  maxima 
and  minima).  Hence,  our  results  apply  as  well  to  any  arbitrary  sampling  of  a 
redundant  dyadic  wavelet  transform.  Furthermore,  we  explained  why  the  con¬ 
clusions  in  [1,2,3]  hold  for  most  of  the  signals  except  for  some  extreme  cases  like 
those  in  [“i..^].  We  discussed  the  numerical  stability  of  sampled  dyadic  wavelet 
representations.  Stability  is  important  in  the  sense  that  a  numerically  unstable 
representation  is  useless  from  a  practical  point  of  view.  Finally,  we  described 
a  fast  Fourier  transform  (FFT)  based  algorithm  for  recovering  a  signal  from  a 
complete  arbitrary  sampling  of  its  dyadic  wavelet  transform. 

3.C.1  Dyadic  Wavelet  transforms  and  Sampled  Dyadic  Wavelet  trans¬ 
forms 


A  fast  dyadic  discrete  wavelet  transform  for  the  discrete  signal  /(n)  is  com¬ 
puted  by  passing  f(n)  through  a  cascade  of  filter  banks  that  consist  of  low-pass 
and  high-pass  filters  ~  Ho(2^u))  and  Gjiw)  =  Go(2^w)  (c.f.  Fig.  15). 

The  decomposition  is  essentially  similar  to  a  discrete  orthogonal  wavelet  de¬ 
composition  in  which  all  decimators  have  been  moved  to  the  last  stage  and  then 
eliminated.  The  prototype  filters  ffol^v)  and  Go(w)  must  satisfy  certain  con¬ 
ditions.  In  particular,  //o(w)  is  a  low-pass  filter  with  a  zero  of  some  order  at 
w  =  7r,  and  Go(w)  is  a  high-pass  filter  with  a  zero  of  same  order  at  w  =  0.  and 
|//j{w)p  +  \Gj{u))\^  =  1.  Furthermore,  Hoiw)  and  Go(--')  have  no  other  zero  on 
the  unit  circle. 

It  is  obvious  that  this  decomposition  is  redundant.  For  example,  if  we  choose 
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Ho  Hi 

/(n)— ^Si/{nr-r*‘Sj/(n) 


Hj.i 


Sj/(n) 


Wi/(n)  W2/(n)  Wjf(n) 


Figure  15;  A  Fast  Dyadic  Wavelet  Transform. 


the  filters  Nj  and  Oj  such  that  they  form  a  perfect  reconstruction  filter  bank 
then  we  can  decimate  the  filter  outputs  by  a  factor  of  two  with  no  loss  of 
information.  The  resulting  decomposition  is  then  called  an  orthonormal  wavelet 
transform.  Another  approach  given  by  Mallat  [40,  38,  39]  is  to  keep  only  the 
locations  and  the  values  of  the  local  extrema  or  zero-crossings  of  Vy,/(n)  of  the 
wavelet  decompositions.  We  call  such  a  signal  representation  a  partial  dyadic 
wavelet  transform  (PDWT)  signal  representation. 

In  our  work,  we  analyzed  the  redundant  wavelet  transform  of  a  finite  size 
sequence  /(n).  As  is  customarily  done  in  this  case,  we  assumed  that  the  given 
sequence  corresponds  to  a  periodic  signal  with  a  period  equal  to  the  length  of 
the  given  sequence.  With  this  assumption,  filtering  signal  f(n)  with  and 

G^(u/)  is  equivalent  to  multiplying  an  A'  x  1  vector  /  that  consists  of  the  A" 
sample  values  of  /{n})  with  circulant  matrices  and  of  dimension  A"  x  A'. 
Specifically,  denote  by  Wj/  the  value  of  the  discrete  wavelet  decomposition 
at  scale  2^.  The  A'  x  1  dyadic  wavelet  transform  vector  Wjf  is  obtained  by 
multiplying  the  A'  x  1  vector  /  of  signal  samples  by  the  circulant  matrix. 

yv)=no-nj.2^j-i  (s.c.i) 

The  smoothed  version  at  scale  2"'  is  Sjf-  It  is  equal  to  the  product  matrix, 

=  (3.C.2) 

The  properties  of  and  G(u)  imply  that  the  rank  of  Sj  is  not  N .  Hence 

it  is  not  possible  to  reconstruct  /  from  Sj /.  The  question  then  is:  “Is  it  possible 
to  reconstruct  /  from  a  PDWT?”  To  answer  this  question,  we  write  the  PDWT 
of  the  signal  representation  as 

Wf  (3.C.3) 

Here  the  matrix  W  is  called  partial  dyadic  wavelet  transform  (PDWT)  matrix 
and  is  defined  as, 

H’=  [5J  infp.  I  •••  IWX.J’'.  (3.c,4) 

The  vectors  pj  =  [p>{0]  consist  of  the  indices  of  the  samples  of  Wjf  that  have 
been  retained  (e.g.  the  extrema  locations  mj ,  or  zero-crossing  locations  Zj )  at 
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scale  j.  The  matrices  VVj.p^  consist  of  the  rows  of  Wj  corresponding  to  the 
locations  in  pj . 

The  advantage  ol  the  above  matrix  form  is  that  the  PDWT  problem  car.  be 
directly  studied  by  examining  the  matrix  W,  For_example.  the  representation  is 
complete,  (that  is,  there  exist  no  other  signals  /  with  the  same  representation 
Wf)  if  and  only  if  rank  of  matrix  W  is  N. 


3.C.2  Completeness  and  stability  of  a  Sampled  Dyadic  Wavelet  trans¬ 
form 

As  noted  above,  the  PDWT  signal  representation  is  said  to  be  complete  (or 
unique)  if  and  only  if  there  exist  no  other  signal  g  other  than  /  such  that  Wg  = 
W/.  This  completeness  (or  uniqueness)  has  been  conjectured  in  [40,  38,  39] 
because  no  counter  examples  had  been  found.  From  our  discussions  in  the 
previous  section,  the  completeness  of  PDWT  can  be  describe  by  the  following 
lemma. 

Lemma  1  The  PDWT  signal  represtnlation  is  compleie  if  and  only  if  rank(W) 
=  N. 


The  proof  to  this  lemma  is  obvious.  Notice  that  a  similar  result  has  been 
derived  in  [3,  4].  As  mentioned  above,  it  is  rather  difficult  to  check  the  com¬ 
pleteness  of  a  PDWT  using  this  Lemma  because  the  size  of  W  can  be  large. 

Using  the  above  lemma  and  the  fact  that  the  eigenvalues  of  an  N  x  A’ 
circulant  matrix  C  are  the  discrete  Fourier  transform  coefficients  of  elements  of 
C  and  the  corresponding  eigenvectors  are  the  columns  of  the  N  x  N  Fourier 
transform  matrix,  we  proved  the  following  completeness  theorem. 


Theorem  1  The  PDWT  representation  is  complete  if  and  only  if 

;=i 


where 


the  nvmber  of  distinct  values  pj(i)  mod  2"^ 

for  1  <  j  <  J  -  1 

the  number  of  distinct  values  pj(i)  mod 

for  j  =  J 


{3.C.5) 


(3.C.6) 


We  outline  here  the  main  argument  used  to  prove  this  result.  Since  Sj(w)  = 
Ho(u>)  ■  ■  ■  W/_i(ui)  has  zeros  of  some  order  only  at  w  =  ?r,  ±i,  •  •  • ,  ±  jrrr. 

rank(5j)  =  JV -2^ -I- 1.  (3.C.7) 
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The  eigenvectors  corresponding  to  the  zero  eigenvalues  of  5^  would  be  ), 
where  the  x  1  Fourier  vector  f{n)  is  defined  as 

:f(n)=  (3.C.8) 

Since  the  information  in  /  that  was  lost  in  Sj  f  should  be  present  in 

Wp/ =  [wf,.  I  ...  I  Wj„f  f,  (3.C,9) 

the  matrix, 

Wp  (3.C.10) 

should  have  rank  2"^  —  1  to  guarantee  completeness.  This  argument  serves  as 
the  basis  of  the  proof  of  the  above  Theorem.  The  detailed  proof  can  be  found 
in  [76]. 

The  above  Theorem  is  important  because  it  provides  us  with  a  simple  way 
of  checking  the  completeness  of  any  sampled  dyadic  wavelet  representation.  In 
particular,  our  test  does  not  involve  computing  the  rank  of  any  matrix. 

Up  to  this  point,  we  discussed  the  signal  compensation  of  the  smoothed  signal 
Sjf  corresponding  to  the  zero  eigenvalues  of  Sj.  To  guarantee  the  numerical 
stability  of  PDWT  representation,  we  need  to  consider  also  the  eigenvalues  of 
Sj  of  negligible  magnitude.  By  slightly  modifying  the  above  theorem,  one  can 
derive  a  similar  condition  on  the  number  and  location  of  the  samples  that  are 
retained  to  guarantee  a  unique  and  numerically  well  behaved  representation 
We  refer  the  interested  reader  to  [76]  for  more  details. 

We  give  here  a  simulation  result.  Fig.  16  shows  a  wavelet  and  its 
scaling  function  <^{x).  Fig.  17  shows  a  signal  /  equal  to  a  row  in  an  image  and 
its  wavelet  transform  up  to  scale  J  =  4  by  using  the  wavelet  given  in  Fig  16 
Fig.  18  shows  the  maxima  representation  of  the  wavelet  transform  in  Fig.  17. 
We  can  examine  the  sampling  locations  of  local  maxima  using  the  theorem  and 
conclude  that  the  representation  is  complete.  Next,  we  detect  the  “edge"  of  the 
image  row.  that  is,  keep  only  the  samples  corresponding  to  the  same  maxima 
locations  at  all  scales  as  shown  in  Fig,  19.  Using  the  completeness  theorem,  we 
find  that  the  representation  is  still  complete. 

3.C.3  Signal  Reconstruction  from  a  Sampled  Dyadic  Wavelet  trans¬ 
form 

The  problem  now  becomes  how  to  reconstruct  the  original  signal  /  from  the 
PDWT  representation  W.  The  above  discussion  gives  rise  to  an  FFT  based  re¬ 
construction  a,gorithrr..  In  particular,  the  original  signal  /  can  be  reconstructed 
as  follows  from  a  PDWT 

/  =  W’W7.  (3.C.11) 
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Figure  16:  Scaling  Function  <p(x)  and  Wavelet  t/'(x). 


In  the  above  equation  where  denotes  the  pseudo  inverse  of  W.  Typically, 
the  above  equation  will  be  difficult  to  be  implement  because  of  the  possibly 
large  size  of  W. 

Instead  of  using  (3.C.11)  we  may  proceed  as  follows.  First,  we  recover  the 
information  about  /  that  is  in  Sjf.  Specifically,  we  construct  an  approximation 
/,  to  /  as 

(3.C.12) 

In  the  above  equation  Q  denotes  the  N  x  N  Fourier  transform  matrix  and 
is  its  complex  conjugate  transpose.  The  diagonal  entries  5j(n)  are  given  by 


if  Sj(n)  ^  0 
otherwise. 


(3.C.13) 


The  difference  /,  between  /  and  f,  is  then  finally  obtained  as 

=  F(WpF)»(Wp(/ -  f,))  (3.C.14) 

where 


F  = 


(3.C.15) 


and  denotes  the  pseudo  inverse  of  A.  The  advantage  of  this  representation 
is  that  it  involves  solving  for  2'^“*  —  1  unknowns  only. 


Figure  18:  Wavelet  Transform  Extrema  Representation  of  /. 
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3.D  Adaptive  wavelet  Representations  for  Low  Bit  Rate 
High  Quality  Audio  Coding 

In  many  application,  such  as  the  design  of  multimedia  workstations  and 
high  quality  audio  transmission  and  storage  the  goal  is  to  achieve  transparent 
coding  of  hi-fidelity  audio  signak  at  bit  rate  below  64  kb/s.  Typically,  the 
quality  of  Compact  Disk(CD)  signals  is  used  as  the  standard  for  high  fidelity 
These  signals  are  characterized  by  a  high  bandwidth  (sampling  rate  of  44.1kHz), 
and  a  high  quality  (16  bits/sample  PCM  coding),  resulting  in  a  high  bit  rate 
of  705  kb/s.  Reducing  this  requirement  while  maintaining  a  near  transparent 
quality  is,  therefore,  crucial  in  the  above  applications. 

In  other  applications  the  objective  is  to  synthesize  music  signals  at  or  below 
10  kb/s.  The  transform  and  subband  coders  have  generally  been  found  to  be 
unsuitable  at  these  rates  and  the  low  bit  rate  speech  coders  ate  almost  invariably 
based  on  the  Linear  Predictive  Coding  (LPC)  algorithms  or  its  extensions.  Since 
the  LPC  model  relies  on  the  human  voice  production  mechanism,  it  may  not 
be  suitable  for  music  and  other  non-speech  sounds.  (Note  that,  some  studies 
suggest  that  Multi  Pulse  LPC  may  be  viable  for  this  task  [50]  .) 

In  general,  an  audio  coding  scheme  must  exploit  irrelevancies  (which  result 
from  the  masking  properties  of  human  hearing)  and  statistical  redundancies  in 
the  signal.  The  currently  known  methods  for  hi-f.  audio  compression  [66]. [29]. 
generally  concentrate  on  exploiting  the  masking  properties  by  looking  at  a  suit¬ 
able  transform  or  subband  components  of  the  signal.  We  proposed  an  audio 
coder  that  attempts  to  maximize  the  compression  by  utilizing  both  the  above 
sources  of  redundancies  in  a  signal.  To  exploit  masking  a  discrete  wavelet  trans¬ 
form  (DWT)  based  Adaptive  Transform  Coding  method  is  employed.  The  DWT 
is  attractive  for  audio  compression  because  it  acts  like  a  Karhunen  Loeve  Trans¬ 
form  (KLT)  for  a  large  class  of  signals  (including  some  non-stationary  signals) 
[61].  Furthermore,  it  offers  the  flexibility  of  choosing  a  basis  matched  to  the 
given  data.  The  proposed  coder  incorporates  an  opiimal  wavelet  basis  search 
procedure  to  utilize  this  flexibility.  The  DWT  part  of  the  coder  builds  on  our 
previous  work  [51].  With  recent  improvements,  it  is  capable  of  maintaining  a 
near  transparent  quality  at  approximately  64  kb/s. 

To  eliminate  the  statistical  redundancies  in  the  signal,  the  DWT  encoder  is 
augmented  by  a  first  stage  of  dynamic  dictionary  based  encoding.  Experiments 
w’ith  several  audio  samples  suggest  that  this  results  in  a  significant  coding  gain. 
The  overall  two-stage  coder  offers  a  solution  to  first  of  the  audio  compression 
problems  noted  above;  i.e.,  it  allows  transparent  coding  of  CD  quality  audio 
signals  at  rates  below  64  kb/s.  Preliminary  studies  suggest  that  it  is  also  a 
viable  approach  for  the  low  bit  rate  synthesis  of  music  signals  at  about  10  kb/s. 

It  should  also  be  noted  that  the  decoding  part  of  this  algorithm  has  been 
implemented  in  real  time  on  a  Texas  Instruments,  TMS320C31  chip.  The  coder 
complexity  is  still  high,  even  though  significant  speedup  in  the  wavelet  adapta¬ 
tion  algorithm  has  been  achieved.  Also,  the  coder  in  its  present  form  leads  to  a 
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variable  data  rate.  The  implementation  of  our  procedure  may  therefore  require 
the  use  of  a  buffering  scheme  in  some  practical  applications. 

3.D.1  Particulars  Of  The  Audio  Compression  Method 

The  overall  coding  algorithm  is  illustrated  in  Figure  20.  In  what  follows, 
we  assume  that  the  incoming  stream  of  audio  data  has  been  sampled  at  44.1 
kHz.  It  is  divided  into  short  frames.  For  each  of  these  frames  a  dictionary  entry 
that  is  perceptually  closest  to  the  signal  is  identified  The  chosen  entry  is  tl  “n 
subtracted  from  the  signal  to  form  an  error  vector.  Both  the  signal  and  the 
error  are  encoded  using  an  optimal  WT  based  approach,  and  shorter  of  the  two 
codes  is  transmitted  to  the  receiver. 

The  audio  coder  uses  two  different  analysis  frame  sizes  of  1024  and  2048 
samples  (46  and  23  ms).  The  ends  of  these  frames  are  tapered  by  the  square- 
root  of  a  Hanning  window,  the  resulting  overlap  corresponds  to  an  oversampling 
ratio  of  6.67%.  The  longer  frame  results  in  lower  bit  rates  and  is.  therefore, 
employed  most  of  the  time.  The  shorter  frame  length,  on  the  other  hand,  is 
used  for  frames  which  are  identified  to  be  susceptible  to  the  “pre-echo”  effect, 
based  on  an  “energy-entropy”  criterion  described  in  a  later  section.  It  may  be 
noted  that  neither  of  the  two  frame  sizes  provide  the  time  resolution  determined 
by  backward  masking  of  sharp  trans'^nts.  i.e.,  4  ms.  But.  the  proposed  coder 
contains  additional  safeguards  against  pre-echos  and  these  are  also  described  in 
a  later  section. 

3.D.2  Discrete  Wavelet  Transformation  of  Audio  Frames 


The  classical  formulation  of  DWT  leads  to  “octave  bandwidth”  filterbanks. 
i.e.,  filter  bandwidth  is  proportionately  higher  for  the  high  frequency  bands 
Frequency  resolution  of  such  filterbanks  does  not  quite  match  the  frequency 
dependent  resolution  required  for  full  exploitation  of  simultaneous  masking  (i.e.. 
the  “critical  band”  structure  [29]).  To  alleviate  this  problem,  we  used  the  wave 
packet  representations  [14].  In  such  a  representation,  the  low  and  high  frequency 
bands  are  further  split.  This  representation  retains  me  flexibility  offered  by  the 
DWT  in  terms  of  choosing  an  optimal  basis.  The  wave-packets  may  be  used  to 
form  any  arbitrary  binary  tree  structured  decomposition.  In  our  work  we  use  a 
tree  containing  a  maximum  of  29  bands  that  closely  r‘'s<  i  bles  the  critical  band 
scale. 


3.D.3  Perceptual  Masking  Constraint  in  the  Wavelet  Transform  Do¬ 
main 

Perceptual  masking  or  simultaneous  masking  is  a  phenomenon  whereby  a 
strong  signal  masKs  a  simultaneously  occuring  weak  (noise)  signal.  The  impor¬ 
tance  of  masking  in  a  coding  system  is  that  a  perceptually  transparent  quality 
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Figure  20:  Block  diagram  of  the  proposed  coder 


is  maintained  as  far  as  the  power  of  reconstruction  error  in  each  of  the  critical 
bands  is  below  a  signal  dependent  threshold.  In  our  computations  of  the  noise 
masking  threshold,  we  use  the  following  particulars:  FFT  analysis  for  windowed 
signal  frame,  assumption  of  additivity  for  the  masked  power,  and  a  model  for 
inter/intra  frequency  masking  proposed  in  [66],  Additionally,  the  threshold  is 
forced  to  be  constant  for  each  of  the  wavelet  (critical)  bands. 

Under  the  assumption  that  the  transform  coder  “whitens"  the  quantization 
error  (which  turns  out  to  be  the  case),  the  threshold  of  masking  may  be  enforced 
by 

CqRqeq  <  £)  (3.D1) 

where  cq  is  the  error  vector  in  the  quantization  of  DWT  coefficients,  Z?  is  a 
threshold,  and  Rq  is  a  positive  definite  matrix  which  depends  on  the  choice  of 
the  wavelet.  In  general  this  matrix  is  not  diagonal.  However,  in  [53]  we  show 
that  if  the  analyzing  wavelet  has  a  large  number  of  vanishing  moments  then 
is  very  close  to  being  diagonal.  Then  the  error  criterion  (1)  reduces  to 

^cnie)€l<D  (3.D2) 

k 

where  0  is  a  vector  of  parameters  that  identifies  the  selected  wavelet,  e*  is  the 
error  in  the  k*’'  \VT  coefficient,  and  c*t(0)  is  the  it"'  diagonal  entry  of  Rq 
The  WT  domain  threshold  cn(0)  remains  constant  for  a  particular  subband 
[53], 
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3.D.4  Audio  Data  Compression  By  Wavelet  Optimization 

In  our  code,  an  optimal  wavelet  is  identified  for  each  frame  of  the  audio 
signal  (N  samples)  and  the  bits  are  optimally  allocated  among  the  different  WT 
coefficients  to  minimize  the  overall  bit  rate  requirement.  In  [53],  we  show  that 
for  a  particular  choice  of  a  wavelet,  the  minimum  number  of  bits  for  quantizing 
the  WT  coefficient  is  given  by  (assuming  a  large  number  of  vanishing  moments 
for  the  wavelet), 

^in(0)  =  I  ^  log!  (3.D.3) 

where  fft  is  the  peak  value  of  the  k*’*  coefficient  (if  a  uniform  adaptive  quantizer 
is  employed),  and  the  constant  C  =  (<  is  a  “safety  factor”  close  to  1) 

The  it**  term  in  the  above  summation  also  gives  the  optimum  bit  allocation, 
>  for  the  Jt‘*  WT  coefficient.  For  a  particular  choice  of  a  wavelet,  the  bit 
rate  requirement  may  then  be  computed  directly  from  the  transform  coefficients 
using  the  above  equation. 

If  an  attempt  is  made  to  identify  the  best  wavelet  by  searching  in  the  pa¬ 
rameter  space,  the  overall  search  complexity  can  be  extremely  high.  In  [53], 
we  conclude  that  for  longer  scaling  sequences  (of  size  K),  the  search  may  be 
limited  to  the  set  of  wavelets  with  maximal  number  (K/2)  of  vanishing  mo¬ 
ments,  with  near  optimal  results.  The  families  of  K  coefficient  A'/2  vanishing 
moment  wavelets  are  generated  using  the  results  in  [17]  (e  g.,  for  K  =  20,  there 
are  exactly  1024  such  wavelets).  Furthermore,  a  fast  search  procedure  may  be 
developed  using  the  fact  that  all  K  coefficient  K/2  vanishing  moment  wavelets 
result  in  filter  banks  with  identical  magnitude  response;  these  differ  only  in 
terms  of  their  phase  responses  (delay).  The  fast  algorithm  significantly  reduces 
the  complexity  of  search  by  limiting  the  optimization  to  the  last  stage  of  the 
wavelet  decomposition  tree. 

The  information  that  needs  to  be  transmitted  to  the  receiver  to  recompute 
bit  allocation  is  {<^k ,Ckk}\^i  ,  a^id  the  choice  of  the  wavelet  0.  It  was  noted 
earlier  that  only  29  values  of  cn  need  to  be  transmitted,  these  are  quantized 
uniformly  on  a  log  scale  using  6  bits  and  a  resolution  of  1.3  dB.  Transmitting 
the  peak  values  tr*  too  frequently  leads  to  excessive  side  information.  In  our 
work,  updating  these  values  once  .-very  8  transform  coefficients  was  found  to 
be  a  suitable  compromise.  The  peak  values  are  quantized  using  a  simple  mean- 
difference  quantizer  at  2  bits/value.  Identifying  the  best,  maximal  vanishing 
moment  wavelet  requires  only  a  small  number  of  bits  per  frame.  The  overall  side- 
information  for  the  WT  based  coder  requires  approximately  0.3  bits/sample. 

3.D.5  Dynamic  Dictionary  Based  Encoding 

Although  the  wavelet  based  coding  method  is  able  to  exploit  the  masking 
characteristics,  further  reduction  in  the  bit  rate  is  possible  by  getting  rid  of 
the  statistical  redundancies.  One  possible  way  to  do  that  is  to  use  predictive 
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coding  (differential  PCM).  However,  this  method  has  generally  been  found  to 
be  unsuitable  for  maintaining  transparent  quality  across  a  wide  range  of  hi-fi 
music  signals.  Another  possibility  is  to  use  vector  quantization  (VQ)  [24].  It 
has  been  shown  that  VQ  coders  are  theoretically  superior  in  the  sense  that  the 
rate  distortion  bounds  on  the  data  rate  are  better  achievable  than  with  scalar 
quantization.  Still,  VQ  has  found  only  a  limited  role  in  hi-fi  audio  coding  because 
of  lingering  doubts  about  its  ability  to  ensure  transparent  quality.  Significant 
among  the  previous  attempts  towards  using  a  VQ  for  audio  compression  is  the 
work  by  Chan  and  Gersho  [12].  It  employs  a  fixed  multi-stage  tree  structured 
codebook  trained  on  DCT  coefficients. 

In  our  work,  in  an  attempt  to  guarantee  transparent  quality  for  all  type  of 
audio  signals,  we  use  a  dynamic  system  as  illustrated  in  Figure  1.  It  employs 
an  adaptive  dictionary  in  a  first  stage  of  VQ  for  the  audio  waveform  At  the 
same  time  the  difference  between  the  audio  waveform  and  the  chosen  dictionary- 
entry  is  also  encoded  using  the  wavelet  based  method  discussed  above.  Using 
this  second  coding  stage  allows  us  to  use  a  dictionary  of  relatively  small  size 
and  rather  simple  methods  for  dictionary  construction  and  update.  However, 
even  under  these  conditions  the  use  of  VQ  as  a  first  stage  results  in  significant 
coding  gain. 

In  Figure  20,  the  DWT  encoding  procedure  for  the  residual  remains  exactly 
the  same  as  the  one  for  the  signal  itself  (as  described  above).  Moreover,  per¬ 
ceptual  threshold  for  the  coding  of  the  difference  signal  is  also  the  same  as  the 
perceptual  threshold  of  the  signal  itself.  This  is  because  the  coding  errors  in  x 
are  exactly  equal  to  the  errors  in  the  quantization  of  residual,  d.  On  the  other 
hand,  the  dictionary  encoding  procedure  works  as  follows.  Each  frame  of  audio 
data  is  split  into  two  halves  (unless  the  frame  length  has  already  been  halved  for 
pre-echo  control).  For  either  of  the  two  half  frames  the  dictionary  is  searched 
for  the  best  perceptual  matches  using  a  procedure  described  below.  The  best 
entries  are  then  subtracted  from  the  respective  halves  of  the  signal  x  to  form 
the  residual  vector  d. 

Dynamic  Dictionary  Search 

Given  a  normal  frame  length  of  .V  samples,  the  above  discussion  indicates  that 
the  dictionary  is  used  to  encode  vectors  of  N/2  samples.  The  dictionary  entries, 
however,  are  maintained  as  “meta  vectors”  of  A’  samples  each.  To  encode  a 
signal  X  we  search  through  the  dictionary  to  find  an  element  fig  which  is  closest 
to  X  in  terms  of  a  “perceptual  distance  measure”.  This  distance  is  computed 
by  executing  the  following  three  steps 

1.  Compute  a  sliding  window  correlation  between  x  and  /i  and  find  the  lag 
Li  corresponding  to  the  peak  of  the  correlation  function  (0  <  i  <  N/2)^ 
Next,  form  a  vector  y^  consisting  of  N/2  samples  of  /i  starting  from  lag 
L,. 

2.  Estimate  the  time  warping  factor  to  normalize  the  time  scale  of  of  to 
that  of  X  (see  below). 
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3.  Compute  a  frequency  weighted  error  for  the  error  vector  e  =  x  —  y,, 
using  the  perceptual  threshold  computed  from  x.  This  error  power  is  the 
requisite  perceptual  distortion  measure. 


Step  1  and  2  above  ensure  a  better  perceptual  match  through  an  improved 
time  and  scale  alignment.  The  incorporation  of  these  steps  increases  the  effective 
size  of  the  dictionary  significantly  beyond  its  physical  dimension. 

Timt  Warp  Factor  Estimation 

In  the  dictionary  based  encoding  it  is  often  useful  to  renormalize  the  time  scale 
of  the  dictionary  entry  to  that  of  the  signal.  However,  in  the  audio  coder  an 
exact  characterization  of  a  warping  function  is  not  an  important  issue  (unlike  the 
speech/speaker  recognition  systems),  strict  residuals  are  also  being  transmitted 
simultaneously.  Thus,  it  may  be  assumed  that  there  is  a  constant  time-warp 
factor  for  the  entire  frame  of  audio  data.  In  [53],  we  show  that  a  warp  factor  o 
that  minimizes  /[z(t)  —  t/(a<)Pdt,  may  be  estimated  as 


Et  -  y(*)][y(<^  -f  I)  -  y(<^)] 


(3D.4) 


where  —0  5  <  7  <  0.5,  and  a  =  1  +  7. 

Dynamic  Dictionary  Update 

An  adaptive  dictionary  has  previously  been  used  in  some  applications,  e  g.,  for 
for  image  compression  [22].  The  dictionary  update  problem  in  the  proposed 
coder  is  somewhat  different  because  dictionary  entries  consist  of  waveforms  and 
the  vector  sizes  are  relatively  large  making  it  impractical  to  collect  several  sam¬ 
ples  before  executing  an  update  step.  Since  the  proposed  encoder  also  encodes 
the  residual,  we  have  so  far  been  able  to  work  with  a  simple  dictionary  update 
procedure  as  follows.  The  minimum  distance  measure  between  x  and  perceptu¬ 
ally  closest  entry  into  the  dictionary  is  compared  against  a  preselected  threshold. 
If  it  is  below  the  threshold  then  the  dictionary  remains  unchanged,  otherwise 
the  decoded  signal  x  is  used  to  update  the  dictionary  using  a  Longest-Time- 
Since-Use  strategy.  Several  improved  techniques  for  dictionary  update  in  the 
audio  coder  are  currently  under  investigation.  These  are  based  on  adaptive 
filtering/multigrid  adaptive  filtering  algorithms 


3.D.6  Adaptive  Framing  for  Pre-echo  Control 

It  was  noted  earlier  that  a  longer  analysis  frame  size  of  2048  samples  is  desir¬ 
able  for  lower  bit  rates.  But,  using  a  large  frame  size  implies  that  reconstruction 
errors  are  spread  over  the  effective  width  of  the  window  in  time.  The  backward 
masking  01  an  impulse  lasts  about  4  ms  and  is  not  sufficient  to  mask  the  error 
spectrum  for  the  full  duration  of  a  longer  frame.  This  leads  to  the  pre-echo  ef¬ 
fect  in  the  presence  of  a  sudden  increase  (impulse)  in  the  signal  energy.  To  limit 
pre-echos,  the  coder  switches  to  a  shorter  frame  size  (1024  samples)  whenever 
presence  of  strong  transients  is  detected  based  on  an  energy-  entropy  estimate 
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(computed  as  follows).  Each  frame  is  divided  into  segments  of  J  (J  =  16  is  a 
suitable  value  at  44.1KHz)  samples  each.  Signal  energy,  <t,,  is  computed  over 
each  of  these  segments  and  is  normalized  by  the  overall  frame  energy.  We  then 
define  energy-entropy  as 

/=-^crflogjO?.  (3.D.5) 

i 

This  entropy  measure  falls  sharply  for  segments  with  sudden  bursts  of  energy. 
An  entropy  value  of  4.5  bits  is  used  as  a  threshold  for  switching  the  frame  size. 

The  shorter  frame  eliminates  pre-echos  to  a  significant  extent.  Further  re¬ 
duction  in  pre-echos  is  accomplished  by  utilizing  the  time  varying  signal  energy' 
estimate  available  in  the  form  of  peak-values.  These  peak-values  are  used  to 
scale  the  WT  domain  masking  threshold  for  the  frames  in  which  presence  of 
sharp  transients  is  detected  based  on  the  above  entropy  criterion 

3.D.7  Experimental  Results 

The  proposed  audio  coder  was  tested  using  a  database  of  several  audio  samples 
(castanets,  piano,  pop,  drums,  clarinet,  etc  ).  For  each  sample  a  perceptually 
transparent  coding  (at  variable  bit  rates)  was  attempted.  The  bit  rales  in  the 
DWT  based  coder  were  found  to  be  in  the  range  of  0.8  —  1.2  bits /sample  for 
coding  the  transform  coefficients.  W'hen  a  dynamic  dictionary  of  150  entries  was 
employed,  the  coefficients  could  be  encoded  at  the  rates  of  0  6- 1 .0  bits/samples. 
Including  the  side  information  (about  0.3  bits/sample),  these  figures  correspond 
to  bit  rates  close  to  64  kb/s  or  below  for  the  DWT  based  method  alone  and  in 
the  range  of  48  —  64  kb/s  with  the  dictionary  encoding 

The  subjective  quality  in  “transparent"  coding  by  the  proposed  coder  was 
assessed  by  comparing  the  quality  of  transcoded  samples  with  that  of  MPEG 
layer-2  coding  [7]  of  the  same  samples.  In  listening  tests,  trained  listeners  were 
unable  to  detect  any  difference  between  the  original  and  the  transcoded  samples 
(of  the  proposed  coder).  In  the  judgement  of  some  listeners  the  quality  of  64 
kb/s  wavelet  coding  was  better  than  128  kb/s  MPEG  layer-2  coding. 

A  detailed  description  of  our  tests  and  test  results  may  be  found  in  [53]. 
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4  PERSONNEL 

•  Principal  Investigator 

1.  Prof.  A.  H.  Tewfik 

•  Graduate  Students 

1.  Deepen  Sinha 

2.  Hehong  Zou 
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8  INTERACTION  WITH  AIR  FORCE  LAB¬ 
ORATORIES 
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